This is an archive of the discontinued LLVM Phabricator instance.

Differential D139603

[llvm-profdata] Add option to cap profile output size
ClosedPublic

Authored by huangjd on Dec 7 2022, 8:23 PM.

Details

Reviewers

davidxl
xur
kazu
ellis
gulfem
snehasish

Commits

rG5b72d0e4f5ee: [llvm-profdata] Add option to cap profile output size

Summary

Allow the user to specify --output-size-limit=n to cap the size of the generated profile to be strictly under n. Functions with the lowest total sample count are dropped first if necessary. Because a heuristic is used, more functions than necessary may be dropped to satisfy the size requirement.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

huangjd created this revision. Dec 7 2022, 8:23 PM

Herald added a project: Restricted Project. Dec 7 2022, 8:23 PM

Herald added subscribers: wenlei, hiraditya.

huangjd requested review of this revision. Dec 7 2022, 8:23 PM

Herald added a subscriber: llvm-commits.

https://discourse.llvm.org/t/rfc-add-option-to-limit-llvm-profdata-profile-output-size/67036 for RFC

Add a test case for completeness.

llvm/lib/ProfileData/SampleProfWriter.cpp
631	unneeded change
639	unneeded change.
665	The clear call changes the behavior. Why is it needed?
llvm/tools/llvm-profdata/llvm-profdata.cpp
1075	Extract the following to a helper function.

Harbormaster completed remote builds in B201875: Diff 481146. Dec 8 2022, 6:27 AM

wenlei added inline comments. Dec 8 2022, 11:56 AM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1087–1089	This heuristic can be quite inaccurate. The number of body sample plus call site sample entries can be a much better proxy for the actual profile size, and it is still easily accessible: `functionSamples.getBodySamples().size() + functionSamples.getCallsiteSamples().size()`

cc @hoy @wlei
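
As an illustration of the entry-count proxy suggested above, here is a minimal sketch. `MockFunctionSamples` and `estimateEntryCount` are hypothetical names introduced only for this example; the mock models just the two accessors named in the comment and is not the real `FunctionSamples` class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for llvm::sampleprof::FunctionSamples. Only the two
// accessors mentioned in the review comment are modeled here.
struct MockFunctionSamples {
  std::map<std::uint32_t, std::uint64_t> BodySamples;   // line offset -> count
  std::map<std::uint32_t, std::string> CallsiteSamples; // line offset -> callee

  const std::map<std::uint32_t, std::uint64_t> &getBodySamples() const {
    return BodySamples;
  }
  const std::map<std::uint32_t, std::string> &getCallsiteSamples() const {
    return CallsiteSamples;
  }
};

// Size proxy: number of body-sample entries plus number of call-site entries.
inline std::size_t estimateEntryCount(const MockFunctionSamples &FS) {
  return FS.getBodySamples().size() + FS.getCallsiteSamples().size();
}
```

A function with two body samples and one call site would score 3 under this proxy, regardless of how large the individual counts are.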

huangjd added inline comments. Dec 14 2022, 4:47 PM

llvm/lib/ProfileData/SampleProfWriter.cpp
665	The name table should only contain names that exist in the current ProfileMap. The original implementation adds to the name table when writing a new profile and never clears the old names, which is actually a bug. (However, SampleProfileWriter is single use: an instance never calls write twice, so this bug did not show up until this new feature.)
llvm/tools/llvm-profdata/llvm-profdata.cpp
1087–1089	The new revision performs the iterations on a string buffer, so the performance of the heuristic is not a big concern (and it should no longer overshoot by removing too many functions). However, a simple proportional heuristic is still too slow because it ends up removing one function at the tail at a time.
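
The name-table issue described in the comment on line 665 can be reproduced with a toy model. `ToyWriter` below is a hypothetical class written for this sketch, not the LLVM writer: it only shows how a table that is never cleared keeps stale names once the same instance writes a second, smaller profile.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy writer (hypothetical). ClearOnWrite selects between the old behavior
// (names accumulate across writes) and the fixed behavior (table is reset).
class ToyWriter {
  std::set<std::string> NameTable;
  bool ClearOnWrite;

public:
  explicit ToyWriter(bool ClearOnWrite) : ClearOnWrite(ClearOnWrite) {}

  // Collects the names for this write and returns how many names the
  // emitted name table would contain.
  std::size_t write(const std::vector<std::string> &Functions) {
    if (ClearOnWrite)
      NameTable.clear();
    NameTable.insert(Functions.begin(), Functions.end());
    return NameTable.size();
  }
};
```

With `ClearOnWrite` set to false, writing three functions and then one function still reports three names on the second write, which is why the problem only appears once a writer instance is reused.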

Refactored code structure
Use a string buffer to rewrite files
Added API for potential new strategy to reduce profile size
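
A rough sketch of the "iterate on a string buffer" idea follows. It is a simplified model under stated assumptions: `ToyProfile`, `serialize`, `eraseColdest`, and `writeWithSizeLimit` are hypothetical stand-ins, the serializer is a plain text format, and one function is pruned per iteration rather than a heuristic batch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical profile model: function name -> total sample count.
using ToyProfile = std::map<std::string, std::uint64_t>;

// Stand-in serializer: one "name:count\n" line per function.
inline std::string serialize(const ToyProfile &P) {
  std::string Out;
  for (const auto &KV : P)
    Out += KV.first + ":" + std::to_string(KV.second) + "\n";
  return Out;
}

// Pruning step: drop the function with the fewest samples.
// Precondition: P is not empty.
inline void eraseColdest(ToyProfile &P) {
  auto Coldest = P.begin();
  for (auto It = P.begin(); It != P.end(); ++It)
    if (It->second < Coldest->second)
      Coldest = It;
  P.erase(Coldest);
}

// Serializes into an in-memory buffer and prunes until the buffer fits.
// A limit of 0 means "no limit". Returns false if nothing fits.
inline bool writeWithSizeLimit(ToyProfile &P, std::size_t Limit,
                               std::string &Out) {
  std::string Buffer = serialize(P);
  while (Limit != 0 && Buffer.size() > Limit) {
    eraseColdest(P);
    if (P.empty())
      return false; // even a single function exceeds the limit
    Buffer = serialize(P);
  }
  Out = Buffer; // only the final, fitting buffer reaches the real output
  return true;
}
```

Because every attempt goes to memory, the real output is written exactly once, which is the property the refactoring above is after.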

huangjd added inline comments. Dec 14 2022, 4:51 PM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1087–1089	Better heuristics can be added in a later patch. More real-world profile data (industrial use) is needed to confirm which model is best.

Harbormaster completed remote builds in B203242: Diff 483034. Dec 14 2022, 6:50 PM

Added output size check

huangjd added reviewers: xur, kazu, ellis, gulfem. Dec 15 2022, 11:51 AM

huangjd added a reviewer: snehasish. Dec 15 2022, 1:29 PM

I have some high-level questions:
(1) Have you considered removing samples with smaller values across the program? It is like downsampling. Compared to removing functions with smaller total counts, I think that results in a more consistent profile.
(2) If we choose to remove functions, and you sort the functions by total count, should we find the exact place to cut to satisfy the size limit? In theory we should be able to, as the profile is organized in units of functions. You can keep writing to the buffer until it reaches the limit. Of course there is some section data for extbinary and the summary, but its size should be computable. Using a heuristic to guess which functions to remove and doing it iteratively does not seem appealing here.

Harbormaster completed remote builds in B203422: Diff 483275. Dec 15 2022, 3:43 PM

The biggest challenge in computing the number of functions accurately is the compression in extbinary, because the compressed size is not linear in the original size. Since profile samples and function names are written to different sections (and in a CS profile the names are split into two sections and samples can also be split into two sections), there is no way to predict the offsets between them ahead of time. Based on use cases, the current heuristic underestimates how many functions to prune (and the last iteration typically converges to pruning 1 function), so it is unlikely to remove too many functions. (Note: I also tried a cubic equation for the heuristic, but that removes too many functions, so the optimal heuristic is between O(n^2) and O(n^3).)

As for downsampling, having a sample count of 0 and not having a sample mean different things to the compiler, so that may change the branch basic block placement on hot functions. I am not sure it is a good idea.
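
For reference, the quadratic estimate discussed above can be written as a standalone function. This sketch mirrors the arithmetic visible in `DefaultFunctionPruningStrategy::Erase` in the diff on this page; the free function `numFunctionsToRemove` and its parameter names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Estimates how many of the coldest functions to drop in one iteration.
// The kept count is scaled by the square of (limit / current size), and at
// least one function is always removed so the loop makes progress.
inline std::size_t numFunctionsToRemove(std::size_t NumFunctions,
                                        std::size_t CurrentSize,
                                        std::size_t Limit) {
  assert(NumFunctions > 0 && CurrentSize > Limit);
  double Ratio = static_cast<double>(Limit) / static_cast<double>(CurrentSize);
  std::size_t NewCount = static_cast<std::size_t>(
      std::round(static_cast<double>(NumFunctions) * Ratio * Ratio));
  std::size_t ToRemove = NumFunctions - NewCount;
  return ToRemove < 1 ? 1 : ToRemove;
}
```

When the output is only slightly over the limit the ratio is close to 1 and the estimate rounds to zero, so the clamp to 1 is what produces the "last iteration prunes 1 function" behavior mentioned in the comment.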

In D139603#4000123, @huangjd wrote:

The biggest challenge in computing the number of functions accurately is the compression in extbinary, because the compressed size is not linear in the original size. Since profile samples and function names are written to different sections (and in a CS profile the names are split into two sections and samples can also be split into two sections), there is no way to predict the offsets between them ahead of time. Based on use cases, the current heuristic underestimates how many functions to prune (and the last iteration typically converges to pruning 1 function), so it is unlikely to remove too many functions. (Note: I also tried a cubic equation for the heuristic, but that removes too many functions, so the optimal heuristic is between O(n^2) and O(n^3).)

I think it makes sense to start with a straightforward implementation and look for opportunities to refine it further in the future.

llvm/include/llvm/ProfileData/SampleProfWriter.h
135	It looks like derived classes must always call this function. Can you add a comment here for the future? Maybe something like `// This function must always be called by the overridden implementation`?
llvm/lib/ProfileData/SampleProfWriter.cpp
58	I think this should be ValueType based on the guidance here: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
llvm/test/tools/llvm-profdata/output-size-limit.test
6	I don't think we should use *-DAG here since that means the test will pass if the lines are reordered. However, if we reorder the symbol line with its contents the text format will be incorrect. I think the -DAG directive is useful if there is non-determinism in the text format where profile information for a whole symbol may appear before another symbol. In this case we should separate out the CHECKs like the example in [1]. Though we should fix non-deterministic output if this is the case. [1] https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive
llvm/tools/llvm-profdata/llvm-profdata.cpp
988	How about moving this to SampleProfileWriter so that we can use this API in other tooling where we don't invoke llvm-profdata?
1035	Perhaps wrap this in DEBUG(), I'm not sure whether it's useful to have this message all the time.
1084	An if statement with initialization would make the intent clearer here: if (EC = File.error(); EC) { }
1218	nit: Just heuristic instead of "heuristic algorithm"?
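
The `if` statement with an initializer suggested for line 1084 is a C++17 feature. A minimal sketch, assuming a hypothetical `MockFile` type that exposes an `error()` accessor like the one in the comment:

```cpp
#include <cassert>
#include <system_error>

// Hypothetical stand-in for a file-like object with an error() accessor.
struct MockFile {
  std::error_code Err;
  std::error_code error() const { return Err; }
};

// The initializer declares EC, and the condition then tests it; EC is
// scoped to the if statement only.
inline bool hasFailed(const MockFile &File) {
  if (std::error_code EC = File.error(); EC) {
    return true; // EC holds the failure reason here
  }
  return false;
}
```

The form keeps the error variable out of the enclosing scope, which is the clarity benefit the comment refers to.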

Updating D139603: [llvm-profdata] Add option to cap profile output size

huangjd marked an inline comment as done. Dec 28 2022, 4:37 PM

huangjd added inline comments.

llvm/include/llvm/ProfileData/SampleProfWriter.h
135	It is not called by derived classes. It is called by llvm-profdata if the writer needs to be reused

Harbormaster completed remote builds in B205112: Diff 485555. Dec 28 2022, 5:14 PM

snehasish added inline comments. Dec 28 2022, 5:24 PM

llvm/include/llvm/ProfileData/SampleProfWriter.h
135	My comment was to add some documentation for future implementations which derive from SampleProfileWriter. For example, in this patch `SampleProfileWriterText::reset` calls `SampleProfileWriter::reset(OS)` on L110. `SampleProfileWriterBinary::reset` calls `SampleProfileWriter::reset(OS)` on L142. I hope that clarifies the suggestion.
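
The pattern the reviewer describes, where an overriding `reset` first delegates to the base class, can be sketched as follows. The class names and the `reset` signature are assumptions made for this example; the final diff shown on this page does not include a `reset` method.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical base writer that owns the output stream.
class ToyWriterBase {
public:
  virtual ~ToyWriterBase() = default;

  // Overrides in derived classes must call this base version, otherwise the
  // output stream is not replaced.
  virtual void reset(std::unique_ptr<std::ostream> OS) { Out = std::move(OS); }

  std::ostream &stream() { return *Out; }

protected:
  std::unique_ptr<std::ostream> Out;
};

// Hypothetical derived writer with extra per-write state.
class ToyBinaryWriter : public ToyWriterBase {
public:
  void reset(std::unique_ptr<std::ostream> OS) override {
    ToyWriterBase::reset(std::move(OS)); // delegate to the base class first
    NameTable.clear();                   // then clear derived-only state
  }

  void addName(const std::string &Name) { NameTable.insert(Name); }
  std::size_t nameCount() const { return NameTable.size(); }

private:
  std::set<std::string> NameTable;
};
```

A comment on the base method, as requested in the review, is the only thing that tells future subclass authors about the required delegation.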

Add comment clarifying reset() usage

huangjd marked 8 inline comments as done. Dec 29 2022, 5:35 PM

Harbormaster completed remote builds in B205187: Diff 485653. Dec 29 2022, 6:20 PM

snehasish added inline comments. Jan 3 2023, 11:34 AM

llvm/tools/llvm-profdata/llvm-profdata.cpp
988	Re-opening since I think we could move the RewriteProfileSizeLimit and CalculateNumFunctionsToRemove methods to SampleProfileWriter as well, so that this heuristic (and subsequent updates to it) can be reused in internal tooling directly.

Refactor: moved implementation to llvm lib so that it can be used by other tools

Harbormaster completed remote builds in B205814: Diff 486443. Jan 4 2023, 7:26 PM

lgtm

Please wait a bit to see if others have additional comments. Thanks!

llvm/include/llvm/ProfileData/SampleProfWriter.h
45	Perhaps specify that this is in bytes either in the variable name or a comment?
llvm/tools/llvm-profdata/llvm-profdata.cpp
1216	I think this is generally useful and we should make it visible.

This revision is now accepted and ready to land. Jan 5 2023, 11:03 AM

Clarified comments

This revision was landed with ongoing or failed builds. Jan 9 2023, 2:01 PM

Closed by commit rG5b72d0e4f5ee: [llvm-profdata] Add option to cap profile output size (authored by huangjd).

This revision was automatically updated to reflect the committed changes.

huangjd marked an inline comment as done.

huangjd added a commit: rG5b72d0e4f5ee: [llvm-profdata] Add option to cap profile output size.

Harbormaster completed remote builds in B206621: Diff 487550. Jan 9 2023, 3:21 PM

This breaks tests on Mac: http://45.33.8.238/macm1/52334/step_11.txt

Please take a look and revert for now if it takes a while to fix.

Windows too: http://45.33.8.238/win/72992/step_11.txt

@huangjd the test you added seems to be failing on Windows. Can you take a look and revert if you need time to investigate?

https://lab.llvm.org/buildbot/#/builders/216/builds/15534

chapuni added a subscriber: chapuni. Jan 9 2023, 7:54 PM

chapuni added inline comments.

llvm/lib/ProfileData/SampleProfWriter.cpp
125	FYI, OriginalFunctionCount was unused but fixed in rG9f4a9d3f4450
127	IterationCount is used only here.

dyung added inline comments. Jan 9 2023, 8:00 PM

llvm/test/tools/llvm-profdata/output-size-limit.test
61	On Windows this seems to be expanded in a way you probably did not expect: (https://lab.llvm.org/buildbot/#/builders/216/builds/15553/steps/7/logs/FAIL__LLVM__output-size-limit_test) `"$(stat" "-c" "%s" "Z:\test\build\test\tools\llvm-profdata\Output\output-size-limit.test.tmp.output)"`

dyung added a reverting change: rGac07911b455e: Revert "[llvm-profdata] Add option to cap profile output size". Jan 9 2023, 11:54 PM

In D139603#4038245, @dyung wrote:

@huangjd the test you added seems to be failing on Windows. Can you take a look and revert if you need time to investigate?

https://lab.llvm.org/buildbot/#/builders/216/builds/15534

How can the size check be done in a cross-platform way?

llvm/include/llvm/ProfileData/SampleProfWriter.h
45	Clarified comment in constructor
llvm/tools/llvm-profdata/llvm-profdata.cpp
988	CalculateNumFunctionToRemove is moved inside FunctionPruningStrategy since it won't be used anywhere else. It can be overridden if necessary.
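
On the cross-platform size check question above: the failing test relied on `$(stat -c %s ...)` shell substitution, which was not expanded on Windows, and a later commit mentioned on this page moved the size check into a unit test. One portable way to express such a check in C++ is sketched below. It uses `std::filesystem` from the standard library rather than LLVM's own file APIs, and `fileFitsLimit` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Returns true if the file exists and its size in bytes is at most
// LimitBytes. Uses the non-throwing overload of file_size, so a missing
// file yields false instead of an exception.
inline bool fileFitsLimit(const std::filesystem::path &File,
                          std::uintmax_t LimitBytes) {
  std::error_code EC;
  std::uintmax_t Size = std::filesystem::file_size(File, EC);
  return !EC && Size <= LimitBytes;
}
```

A unit test could write the profile to a temporary file and assert `fileFitsLimit(Path, Limit)` without depending on any shell utility.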

huangjd mentioned this in D141446: [llvm-profdata] Add option to cap profile output size. Jan 10 2023, 4:45 PM

huangjd mentioned this in rGc268f850a299: Fix to D139603(reverted) - moved size check to unit test so that it is cross…. Jan 11 2023, 4:42 PM

vitalybuka mentioned this in rGc37694817a59: Revert "Fix to D139603(reverted) - moved size check to unit test so that it is…. Jan 11 2023, 11:25 PM

huangjd mentioned this in rG48f163b889a8: [llvm-profdata] Add option to cap profile output size. Feb 7 2023, 6:19 PM

huangjd mentioned this in rG79971d0d771a: [llvm-profdata] Add option to cap profile output size. Feb 8 2023, 2:22 PM

Revision Contents

Path	Size
llvm/include/llvm/ProfileData/SampleProfWriter.h	65 lines
llvm/lib/ProfileData/SampleProfWriter.cpp	134 lines
llvm/test/tools/llvm-profdata/output-size-limit.test	119 lines
llvm/tools/llvm-profdata/llvm-profdata.cpp	21 lines

Diff 487551

llvm/include/llvm/ProfileData/SampleProfWriter.h

Show All 29 Lines	enum SectionLayout {
DefaultLayout,		DefaultLayout,
// The layout splits profile with context information from profile without		// The layout splits profile with context information from profile without
// context information. When Thinlto is enabled, ThinLTO postlink phase only		// context information. When Thinlto is enabled, ThinLTO postlink phase only
// has to load profile with context information and can skip the other part.		// has to load profile with context information and can skip the other part.
CtxSplitLayout,		CtxSplitLayout,
NumOfLayout,		NumOfLayout,
};		};

		/// When writing a profile with size limit, user may want to use a different
		/// strategy to reduce function count other than dropping functions with fewest
		/// samples first. In this case a class implementing the same interfaces should
		/// be provided to SampleProfileWriter::writeWithSizeLimit().
		class FunctionPruningStrategy {
		protected:
		SampleProfileMap &ProfileMap;
		size_t OutputSizeLimit;
		snehasish: Perhaps specify that this is in bytes either in the variable name or a comment?
		huangjd: Clarified comment in constructor

		public:
		/// \p ProfileMap A reference to the original profile map. It will be modified
		/// by Erase().
		/// \p OutputSizeLimit Size limit in bytes of the output profile. This is
		/// necessary to estimate how many functions to remove.
		FunctionPruningStrategy(SampleProfileMap &ProfileMap, size_t OutputSizeLimit)
		: ProfileMap(ProfileMap), OutputSizeLimit(OutputSizeLimit) {}

		virtual ~FunctionPruningStrategy() = default;

		/// SampleProfileWriter::writeWithSizeLimit() calls this after every write
		/// iteration if the output size still exceeds the limit. This function
		/// should erase some functions from the profile map so that the writer tries
		/// to write the profile again with fewer functions. At least 1 entry from the
		/// profile map must be erased.
		///
		/// \p CurrentOutputSize Number of bytes in the output if current profile map
		/// is written.
		virtual void Erase(size_t CurrentOutputSize) = 0;
		};

		class DefaultFunctionPruningStrategy : public FunctionPruningStrategy {
		std::vector<NameFunctionSamples> SortedFunctions;

		public:
		DefaultFunctionPruningStrategy(SampleProfileMap &ProfileMap,
		size_t OutputSizeLimit);

		/// In this default implementation, functions with fewest samples are dropped
		/// first. Since the exact size of the output cannot be easily calculated due
		/// to compression, we use a heuristic to remove as many functions as
		/// necessary but not too many, aiming to minimize the number of write
		/// iterations.
		/// Empirically, functions with larger total sample count contain linearly
		/// more sample entries, meaning it takes linearly more space to write them.
		/// The cumulative length is therefore quadratic if all functions are sorted
		/// by total sample count.
		/// TODO: Find better heuristic.
		void Erase(size_t CurrentOutputSize) override;
		};

/// Sample-based profile writer. Base class.		/// Sample-based profile writer. Base class.
class SampleProfileWriter {		class SampleProfileWriter {
public:		public:
virtual ~SampleProfileWriter() = default;		virtual ~SampleProfileWriter() = default;

/// Write sample profiles in \p S.		/// Write sample profiles in \p S.
///		///
/// \returns status code of the file update operation.		/// \returns status code of the file update operation.
virtual std::error_code writeSample(const FunctionSamples &S) = 0;		virtual std::error_code writeSample(const FunctionSamples &S) = 0;

/// Write all the sample profiles in the given map of samples.		/// Write all the sample profiles in the given map of samples.
///		///
/// \returns status code of the file update operation.		/// \returns status code of the file update operation.
virtual std::error_code write(const SampleProfileMap &ProfileMap);		virtual std::error_code write(const SampleProfileMap &ProfileMap);

		/// Write sample profiles up to given size limit, using the pruning strategy
		/// to drop some functions if necessary.
		///
		/// \returns status code of the file update operation.
		template <typename FunctionPruningStrategy = DefaultFunctionPruningStrategy>
		std::error_code writeWithSizeLimit(SampleProfileMap &ProfileMap,
		size_t OutputSizeLimit) {
		FunctionPruningStrategy Strategy(ProfileMap, OutputSizeLimit);
		return writeWithSizeLimitInternal(ProfileMap, OutputSizeLimit, &Strategy);
		}

raw_ostream &getOutputStream() { return *OutputStream; }		raw_ostream &getOutputStream() { return *OutputStream; }

/// Profile writer factory.		/// Profile writer factory.
///		///
/// Create a new file writer based on the value of \p Format.		/// Create a new file writer based on the value of \p Format.
static ErrorOr<std::unique_ptr<SampleProfileWriter>>		static ErrorOr<std::unique_ptr<SampleProfileWriter>>
create(StringRef Filename, SampleProfileFormat Format);		create(StringRef Filename, SampleProfileFormat Format);

/// Create a new stream writer based on the value of \p Format.		/// Create a new stream writer based on the value of \p Format.
/// For testing.		/// For testing.
static ErrorOr<std::unique_ptr<SampleProfileWriter>>		static ErrorOr<std::unique_ptr<SampleProfileWriter>>
create(std::unique_ptr<raw_ostream> &OS, SampleProfileFormat Format);		create(std::unique_ptr<raw_ostream> &OS, SampleProfileFormat Format);

virtual void setProfileSymbolList(ProfileSymbolList *PSL) {}		virtual void setProfileSymbolList(ProfileSymbolList *PSL) {}
virtual void setToCompressAllSections() {}		virtual void setToCompressAllSections() {}
virtual void setUseMD5() {}		virtual void setUseMD5() {}
virtual void setPartialProfile() {}		virtual void setPartialProfile() {}
virtual void resetSecLayout(SectionLayout SL) {}		virtual void resetSecLayout(SectionLayout SL) {}

protected:		protected:
SampleProfileWriter(std::unique_ptr<raw_ostream> &OS)		SampleProfileWriter(std::unique_ptr<raw_ostream> &OS)
: OutputStream(std::move(OS)) {}		: OutputStream(std::move(OS)) {}
		snehasish: It looks like derived classes must always call this function. Can you add a comment here for the future? Maybe something like `// This function must always be called by the overridden implementation`?
		huangjd: It is not called by derived classes. It is called by llvm-profdata if the writer needs to be reused
		snehasish: My comment was to add some documentation for future implementations which derive from SampleProfileWriter. For example, in this patch `SampleProfileWriterText::reset` calls `SampleProfileWriter::reset(OS)` on L110. `SampleProfileWriterBinary::reset` calls `SampleProfileWriter::reset(OS)` on L142. I hope that clarifies the suggestion.

/// Write a file header for the profile file.		/// Write a file header for the profile file.
virtual std::error_code writeHeader(const SampleProfileMap &ProfileMap) = 0;		virtual std::error_code writeHeader(const SampleProfileMap &ProfileMap) = 0;

// Write function profiles to the profile file.		// Write function profiles to the profile file.
virtual std::error_code writeFuncProfiles(const SampleProfileMap &ProfileMap);		virtual std::error_code writeFuncProfiles(const SampleProfileMap &ProfileMap);

		std::error_code writeWithSizeLimitInternal(SampleProfileMap &ProfileMap,
		size_t OutputSizeLimit,
		FunctionPruningStrategy *Strategy);

/// Output stream where to emit the profile to.		/// Output stream where to emit the profile to.
std::unique_ptr<raw_ostream> OutputStream;		std::unique_ptr<raw_ostream> OutputStream;

/// Profile summary.		/// Profile summary.
std::unique_ptr<ProfileSummary> Summary;		std::unique_ptr<ProfileSummary> Summary;

/// Compute summary for this profile.		/// Compute summary for this profile.
void computeSummary(const SampleProfileMap &ProfileMap);		void computeSummary(const SampleProfileMap &ProfileMap);
▲ Show 20 Lines • Show All 309 Lines • Show Last 20 Lines

llvm/lib/ProfileData/SampleProfWriter.cpp

Show All 24 Lines
#include "llvm/Support/Endian.h"		#include "llvm/Support/Endian.h"
#include "llvm/Support/EndianStream.h"		#include "llvm/Support/EndianStream.h"
#include "llvm/Support/ErrorOr.h"		#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/FileSystem.h"		#include "llvm/Support/FileSystem.h"
#include "llvm/Support/LEB128.h"		#include "llvm/Support/LEB128.h"
#include "llvm/Support/MD5.h"		#include "llvm/Support/MD5.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <algorithm>		#include <algorithm>
		#include <cmath>
#include <cstdint>		#include <cstdint>
#include <memory>		#include <memory>
#include <set>		#include <set>
#include <system_error>		#include <system_error>
#include <utility>		#include <utility>
#include <vector>		#include <vector>

		#define DEBUG_TYPE "llvm-profdata"

using namespace llvm;		using namespace llvm;
using namespace sampleprof;		using namespace sampleprof;

		namespace llvm {
		namespace support {
		namespace endian {
		namespace {

		// Adapter class to llvm::support::endian::Writer for pwrite().
		struct SeekableWriter {
		raw_pwrite_stream &OS;
		endianness Endian;
		SeekableWriter(raw_pwrite_stream &OS, endianness Endian)
		: OS(OS), Endian(Endian) {}

		template <typename ValueType>
		snehasish: I think this should be ValueType based on the guidance here: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
		void pwrite(ValueType Val, size_t Offset) {
		std::string StringBuf;
		raw_string_ostream SStream(StringBuf);
		Writer(SStream, Endian).write(Val);
		OS.pwrite(StringBuf.data(), StringBuf.size(), Offset);
		}
		};

		} // namespace
		} // namespace endian
		} // namespace support
		} // namespace llvm

		DefaultFunctionPruningStrategy::DefaultFunctionPruningStrategy(
		SampleProfileMap &ProfileMap, size_t OutputSizeLimit)
		: FunctionPruningStrategy(ProfileMap, OutputSizeLimit) {
		sortFuncProfiles(ProfileMap, SortedFunctions);
		}

		void DefaultFunctionPruningStrategy::Erase(size_t CurrentOutputSize) {
		double D = (double)OutputSizeLimit / CurrentOutputSize;
		size_t NewSize = (size_t)round(ProfileMap.size() * D * D);
		size_t NumToRemove = ProfileMap.size() - NewSize;
		if (NumToRemove < 1)
		NumToRemove = 1;

		assert(NumToRemove <= SortedFunctions.size());
		llvm::for_each(
		llvm::make_range(SortedFunctions.begin() + SortedFunctions.size() -
		NumToRemove,
		SortedFunctions.end()),
		[&](const NameFunctionSamples &E) { ProfileMap.erase(E.first); });
		SortedFunctions.resize(SortedFunctions.size() - NumToRemove);
		}

		std::error_code SampleProfileWriter::writeWithSizeLimitInternal(
		SampleProfileMap &ProfileMap, size_t OutputSizeLimit,
		FunctionPruningStrategy *Strategy) {
		if (OutputSizeLimit == 0)
		return write(ProfileMap);

		size_t OriginalFunctionCount = ProfileMap.size();

		SmallVector<char> StringBuffer;
		std::unique_ptr<raw_ostream> BufferStream(
		new raw_svector_ostream(StringBuffer));
		OutputStream.swap(BufferStream);

		if (std::error_code EC = write(ProfileMap))
		return EC;
		size_t IterationCount = 0;
		while (StringBuffer.size() > OutputSizeLimit) {
		Strategy->Erase(StringBuffer.size());

		if (ProfileMap.size() == 0)
		return sampleprof_error::too_large;

		StringBuffer.clear();
		OutputStream.reset(new raw_svector_ostream(StringBuffer));
		if (std::error_code EC = write(ProfileMap))
		return EC;
		IterationCount++;
		}

		OutputStream.swap(BufferStream);
		OutputStream->write(StringBuffer.data(), StringBuffer.size());
		LLVM_DEBUG(dbgs() << "Profile originally has " << OriginalFunctionCount
		chapuni: FYI, OriginalFunctionCount was unused but fixed in rG9f4a9d3f4450
		<< " functions, reduced to " << ProfileMap.size() << " in "
		<< IterationCount << " iterations\n");
		chapuni: IterationCount is used only here.
		return sampleprof_error::success;
		}

std::error_code		std::error_code
SampleProfileWriter::writeFuncProfiles(const SampleProfileMap &ProfileMap) {		SampleProfileWriter::writeFuncProfiles(const SampleProfileMap &ProfileMap) {
std::vector<NameFunctionSamples> V;		std::vector<NameFunctionSamples> V;
sortFuncProfiles(ProfileMap, V);		sortFuncProfiles(ProfileMap, V);
for (const auto &I : V) {		for (const auto &I : V) {
if (std::error_code EC = writeSample(*I.second))		if (std::error_code EC = writeSample(*I.second))
return EC;		return EC;
}		}
▲ Show 20 Lines • Show All 60 Lines • ▼ Show 20 Lines	std::error_code SampleProfileWriterExtBinaryBase::addNewSection(
}		}
SecHdrTable.push_back({Type, Entry.Flags, SectionStart - FileStart,		SecHdrTable.push_back({Type, Entry.Flags, SectionStart - FileStart,
OutputStream->tell() - SectionStart, LayoutIdx});		OutputStream->tell() - SectionStart, LayoutIdx});
return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code		std::error_code
SampleProfileWriterExtBinaryBase::write(const SampleProfileMap &ProfileMap) {		SampleProfileWriterExtBinaryBase::write(const SampleProfileMap &ProfileMap) {
		// When calling write on a different profile map, existing states should be
		// cleared.
		NameTable.clear();
		CSNameTable.clear();
		SecHdrTable.clear();

if (std::error_code EC = writeHeader(ProfileMap))		if (std::error_code EC = writeHeader(ProfileMap))
return EC;		return EC;

std::string LocalBuf;		std::string LocalBuf;
LocalBufStream = std::make_unique<raw_string_ostream>(LocalBuf);		LocalBufStream = std::make_unique<raw_string_ostream>(LocalBuf);
if (std::error_code EC = writeSections(ProfileMap))		if (std::error_code EC = writeSections(ProfileMap))
return EC;		return EC;

▲ Show 20 Lines • Show All 402 Lines • ▼ Show 20 Lines
SampleProfileWriterBinary::writeContextIdx(const SampleContext &Context) {		SampleProfileWriterBinary::writeContextIdx(const SampleContext &Context) {
assert(!Context.hasContext() && "cs profile is not supported");		assert(!Context.hasContext() && "cs profile is not supported");
return writeNameIdx(Context.getName());		return writeNameIdx(Context.getName());
}		}

std::error_code SampleProfileWriterBinary::writeNameIdx(StringRef FName) {		std::error_code SampleProfileWriterBinary::writeNameIdx(StringRef FName) {
auto &NTable = getNameTable();		auto &NTable = getNameTable();
const auto &Ret = NTable.find(FName);		const auto &Ret = NTable.find(FName);
if (Ret == NTable.end())		if (Ret == NTable.end())
		davidxl: unneeded change
return sampleprof_error::truncated_name_table;		return sampleprof_error::truncated_name_table;
encodeULEB128(Ret->second, *OutputStream);		encodeULEB128(Ret->second, *OutputStream);
return sampleprof_error::success;		return sampleprof_error::success;
}		}

void SampleProfileWriterBinary::addName(StringRef FName) {		void SampleProfileWriterBinary::addName(StringRef FName) {
auto &NTable = getNameTable();		auto &NTable = getNameTable();
NTable.insert(std::make_pair(FName, 0));		NTable.insert(std::make_pair(FName, 0));
		davidxl: unneeded change.
}		}

void SampleProfileWriterBinary::addContext(const SampleContext &Context) {		void SampleProfileWriterBinary::addContext(const SampleContext &Context) {
addName(Context.getName());		addName(Context.getName());
}		}

void SampleProfileWriterBinary::addNames(const FunctionSamples &S) {		void SampleProfileWriterBinary::addNames(const FunctionSamples &S) {
// Add all the names in indirect call targets.		// Add all the names in indirect call targets.
Show All 9 Lines	for (const auto &FS : J.second) {
const FunctionSamples &CalleeSamples = FS.second;		const FunctionSamples &CalleeSamples = FS.second;
addName(CalleeSamples.getName());		addName(CalleeSamples.getName());
addNames(CalleeSamples);		addNames(CalleeSamples);
}		}
}		}

void SampleProfileWriterExtBinaryBase::addContext(		void SampleProfileWriterExtBinaryBase::addContext(
const SampleContext &Context) {		const SampleContext &Context) {
if (Context.hasContext()) {		if (Context.hasContext()) {
		davidxl: The clear call changes the behavior. Why is it needed?
		huangjd: Name table should only contains names exist in the current ProfileMap. The original implementation adds to the name table when writing a new profile, and the old names are never cleared, which is actually a bug. (However SampleProfileWriter is single use, an instance never calls write twice, so this bug was not showing up until this new feature)
for (auto &Callsite : Context.getContextFrames())		for (auto &Callsite : Context.getContextFrames())
SampleProfileWriterBinary::addName(Callsite.FuncName);		SampleProfileWriterBinary::addName(Callsite.FuncName);
CSNameTable.insert(std::make_pair(Context, 0));		CSNameTable.insert(std::make_pair(Context, 0));
} else {		} else {
SampleProfileWriterBinary::addName(Context.getName());		SampleProfileWriterBinary::addName(Context.getName());
}		}
}		}

Show All 20 Lines	std::error_code SampleProfileWriterBinary::writeNameTable() {
}		}
return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code SampleProfileWriterCompactBinary::writeFuncOffsetTable() {		std::error_code SampleProfileWriterCompactBinary::writeFuncOffsetTable() {
auto &OS = *OutputStream;		auto &OS = *OutputStream;

// Fill the slot remembered by TableOffset with the offset of FuncOffsetTable.		// Fill the slot remembered by TableOffset with the offset of FuncOffsetTable.
auto &OFS = static_cast<raw_fd_ostream &>(OS);
uint64_t FuncOffsetTableStart = OS.tell();		uint64_t FuncOffsetTableStart = OS.tell();
if (OFS.seek(TableOffset) == (uint64_t)-1)
return sampleprof_error::ostream_seek_unsupported;
support::endian::Writer Writer(*OutputStream, support::little);
Writer.write(FuncOffsetTableStart);
if (OFS.seek(FuncOffsetTableStart) == (uint64_t)-1)
return sampleprof_error::ostream_seek_unsupported;
		support::endian::SeekableWriter Writer(static_cast<raw_pwrite_stream &>(OS),
		support::little);
		Writer.pwrite(FuncOffsetTableStart, TableOffset);

// Write out the table size.		// Write out the table size.
encodeULEB128(FuncOffsetTable.size(), OS);		encodeULEB128(FuncOffsetTable.size(), OS);

// Write out FuncOffsetTable.		// Write out FuncOffsetTable.
for (auto Entry : FuncOffsetTable) {		for (auto Entry : FuncOffsetTable) {
if (std::error_code EC = writeNameIdx(Entry.first))		if (std::error_code EC = writeNameIdx(Entry.first))
return EC;		return EC;
Show All 21 Lines	SampleProfileWriterBinary::writeMagicIdent(SampleProfileFormat Format) {
// Write file magic identifier.		// Write file magic identifier.
encodeULEB128(SPMagic(Format), OS);		encodeULEB128(SPMagic(Format), OS);
encodeULEB128(SPVersion(), OS);		encodeULEB128(SPVersion(), OS);
return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code		std::error_code
SampleProfileWriterBinary::writeHeader(const SampleProfileMap &ProfileMap) {		SampleProfileWriterBinary::writeHeader(const SampleProfileMap &ProfileMap) {
		// When calling write on a different profile map, existing names should be
		// cleared.
		NameTable.clear();

writeMagicIdent(Format);		writeMagicIdent(Format);

computeSummary(ProfileMap);		computeSummary(ProfileMap);
if (auto EC = writeSummary())		if (auto EC = writeSummary())
return EC;		return EC;

// Generate the name table for all the functions referenced in the profile.		// Generate the name table for all the functions referenced in the profile.
for (const auto &I : ProfileMap) {		for (const auto &I : ProfileMap) {
Show All 24 Lines	for (uint32_t i = 0; i < SectionHdrLayout.size(); i++) {
Writer.write(static_cast<uint64_t>(-1));		Writer.write(static_cast<uint64_t>(-1));
Writer.write(static_cast<uint64_t>(-1));		Writer.write(static_cast<uint64_t>(-1));
Writer.write(static_cast<uint64_t>(-1));		Writer.write(static_cast<uint64_t>(-1));
Writer.write(static_cast<uint64_t>(-1));		Writer.write(static_cast<uint64_t>(-1));
}		}
}		}

std::error_code SampleProfileWriterExtBinaryBase::writeSecHdrTable() {		std::error_code SampleProfileWriterExtBinaryBase::writeSecHdrTable() {
auto &OFS = static_cast<raw_fd_ostream &>(*OutputStream);
uint64_t Saved = OutputStream->tell();

// Set OutputStream to the location saved in SecHdrTableOffset.
if (OFS.seek(SecHdrTableOffset) == (uint64_t)-1)
return sampleprof_error::ostream_seek_unsupported;
support::endian::Writer Writer(*OutputStream, support::little);

assert(SecHdrTable.size() == SectionHdrLayout.size() &&		assert(SecHdrTable.size() == SectionHdrLayout.size() &&
"SecHdrTable entries doesn't match SectionHdrLayout");		"SecHdrTable entries doesn't match SectionHdrLayout");
SmallVector<uint32_t, 16> IndexMap(SecHdrTable.size(), -1);		SmallVector<uint32_t, 16> IndexMap(SecHdrTable.size(), -1);
for (uint32_t TableIdx = 0; TableIdx < SecHdrTable.size(); TableIdx++) {		for (uint32_t TableIdx = 0; TableIdx < SecHdrTable.size(); TableIdx++) {
IndexMap[SecHdrTable[TableIdx].LayoutIndex] = TableIdx;		IndexMap[SecHdrTable[TableIdx].LayoutIndex] = TableIdx;
}		}

// Write the section header table in the order specified in		// Write the section header table in the order specified in
// SectionHdrLayout. SectionHdrLayout specifies the sections		// SectionHdrLayout. SectionHdrLayout specifies the sections
// order in which profile reader expect to read, so the section		// order in which profile reader expect to read, so the section
// header table should be written in the order in SectionHdrLayout.		// header table should be written in the order in SectionHdrLayout.
// Note that the section order in SecHdrTable may be different		// Note that the section order in SecHdrTable may be different
// from the order in SectionHdrLayout, for example, SecFuncOffsetTable		// from the order in SectionHdrLayout, for example, SecFuncOffsetTable
// needs to be computed after SecLBRProfile (the order in SecHdrTable),		// needs to be computed after SecLBRProfile (the order in SecHdrTable),
// but it needs to be read before SecLBRProfile (the order in		// but it needs to be read before SecLBRProfile (the order in
// SectionHdrLayout). So we use IndexMap above to switch the order.		// SectionHdrLayout). So we use IndexMap above to switch the order.
		support::endian::SeekableWriter Writer(
		static_cast<raw_pwrite_stream &>(*OutputStream), support::little);
for (uint32_t LayoutIdx = 0; LayoutIdx < SectionHdrLayout.size();		for (uint32_t LayoutIdx = 0; LayoutIdx < SectionHdrLayout.size();
LayoutIdx++) {		LayoutIdx++) {
assert(IndexMap[LayoutIdx] < SecHdrTable.size() &&		assert(IndexMap[LayoutIdx] < SecHdrTable.size() &&
"Incorrect LayoutIdx in SecHdrTable");		"Incorrect LayoutIdx in SecHdrTable");
auto Entry = SecHdrTable[IndexMap[LayoutIdx]];		auto Entry = SecHdrTable[IndexMap[LayoutIdx]];
Writer.write(static_cast<uint64_t>(Entry.Type));		Writer.pwrite(static_cast<uint64_t>(Entry.Type),
Writer.write(static_cast<uint64_t>(Entry.Flags));		SecHdrTableOffset + 4 * LayoutIdx * sizeof(uint64_t));
Writer.write(static_cast<uint64_t>(Entry.Offset));		Writer.pwrite(static_cast<uint64_t>(Entry.Flags),
Writer.write(static_cast<uint64_t>(Entry.Size));		SecHdrTableOffset + (4 * LayoutIdx + 1) * sizeof(uint64_t));
		Writer.pwrite(static_cast<uint64_t>(Entry.Offset),
		SecHdrTableOffset + (4 * LayoutIdx + 2) * sizeof(uint64_t));
		Writer.pwrite(static_cast<uint64_t>(Entry.Size),
		SecHdrTableOffset + (4 * LayoutIdx + 3) * sizeof(uint64_t));
}		}

// Reset OutputStream.
if (OFS.seek(Saved) == (uint64_t)-1)
return sampleprof_error::ostream_seek_unsupported;

return sampleprof_error::success;		return sampleprof_error::success;
}		}
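
The change above replaces seek/write/seek-back with positional writes at computed offsets. A rough sketch of that idea on a plain byte vector follows. All names here (`appendU64LE`, `patchU64LE`, `readU64LE`, `ToySecHdrEntry`, `reserveHeaderTable`, `patchHeaderEntry`) are hypothetical and are not LLVM APIs; the only things taken from the diff are the four 64-bit fields per entry, the all-ones placeholders, and the offset arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 64-bit value in little-endian byte order.
void appendU64LE(std::vector<unsigned char> &Buf, uint64_t V) {
  for (int I = 0; I < 8; ++I)
    Buf.push_back(static_cast<unsigned char>((V >> (8 * I)) & 0xff));
}

// Overwrite 8 bytes at Offset without changing the buffer size, so the
// "current position" for later appends is unaffected.
void patchU64LE(std::vector<unsigned char> &Buf, uint64_t V, size_t Offset) {
  assert(Offset + 8 <= Buf.size() && "patch must stay inside written data");
  for (int I = 0; I < 8; ++I)
    Buf[Offset + I] = static_cast<unsigned char>((V >> (8 * I)) & 0xff);
}

// Read back a little-endian 64-bit value.
uint64_t readU64LE(const std::vector<unsigned char> &Buf, size_t Offset) {
  assert(Offset + 8 <= Buf.size() && "read must stay inside written data");
  uint64_t V = 0;
  for (int I = 0; I < 8; ++I)
    V |= static_cast<uint64_t>(Buf[Offset + I]) << (8 * I);
  return V;
}

// Hypothetical section header entry: four 64-bit fields, as in the diff.
struct ToySecHdrEntry {
  uint64_t Type, Flags, Offset, Size;
};

// Reserve placeholder slots (all bits set) and return where the table starts.
size_t reserveHeaderTable(std::vector<unsigned char> &Buf, size_t NumEntries) {
  size_t Start = Buf.size();
  for (size_t I = 0; I < 4 * NumEntries; ++I)
    appendU64LE(Buf, static_cast<uint64_t>(-1));
  return Start;
}

// Fill in one entry in place, using the same offset formula as the diff:
// TableOffset + (4 * LayoutIdx + FieldIdx) * sizeof(uint64_t).
void patchHeaderEntry(std::vector<unsigned char> &Buf, size_t TableOffset,
                      size_t LayoutIdx, const ToySecHdrEntry &Entry) {
  size_t Base = TableOffset + 4 * LayoutIdx * sizeof(uint64_t);
  patchU64LE(Buf, Entry.Type, Base);
  patchU64LE(Buf, Entry.Flags, Base + 1 * sizeof(uint64_t));
  patchU64LE(Buf, Entry.Offset, Base + 2 * sizeof(uint64_t));
  patchU64LE(Buf, Entry.Size, Base + 3 * sizeof(uint64_t));
}
```

Because patching never moves the append position, there is nothing to save and restore, and the approach also works on an in-memory buffer where seeking a file stream is not available.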

std::error_code SampleProfileWriterExtBinaryBase::writeHeader(		std::error_code SampleProfileWriterExtBinaryBase::writeHeader(
const SampleProfileMap &ProfileMap) {		const SampleProfileMap &ProfileMap) {
auto &OS = *OutputStream;		auto &OS = *OutputStream;
FileStart = OS.tell();		FileStart = OS.tell();
writeMagicIdent(Format);		writeMagicIdent(Format);

llvm/test/tools/llvm-profdata/output-size-limit.test

This file was added.

				Tests for output-size-limit option. Functions with least sample count are dropped.

				1- No effect if output size limit >= original size
				RUN: llvm-profdata merge --sample --text --output-size-limit=212 %p/Inputs/sample-profile.proftext \| FileCheck %s --check-prefix=TEST_TEXT1
				TEST_TEXT1: main:184019:0
				TEST_TEXT1-NEXT: 4: 534
				snehasish (Done): I don't think we should use -DAG here, since that means the test will pass if the lines are reordered. However, if we reorder the symbol line with its contents, the text format will be incorrect. I think the -DAG directive is useful if there is non-determinism in the text format, where profile information for a whole symbol may appear before another symbol. In that case we should separate out the CHECKs like the example in [1], though we should fix non-deterministic output if this is the case. [1] https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive
				TEST_TEXT1-NEXT: 4.2: 534
				TEST_TEXT1-NEXT: 5: 1075
				TEST_TEXT1-NEXT: 5.1: 1075
				TEST_TEXT1-NEXT: 6: 2080
				TEST_TEXT1-NEXT: 7: 534
				TEST_TEXT1-NEXT: 9: 2064 _Z3bari:1471 _Z3fooi:631
				TEST_TEXT1-NEXT: 10: inline1:1000
				TEST_TEXT1-NEXT: 1: 1000
				TEST_TEXT1-NEXT: 10: inline2:2000
				TEST_TEXT1-NEXT: 1: 2000
				TEST_TEXT1-NEXT: _Z3bari:20301:1437
				TEST_TEXT1-NEXT: 1: 1437
				TEST_TEXT1-NEXT: _Z3fooi:7711:610
				TEST_TEXT1-NEXT: 1: 610

				2- 1 function dropped
				RUN: llvm-profdata merge --sample --text --output-size-limit=211 %p/Inputs/sample-profile.proftext \| FileCheck %s --check-prefix=TEST_TEXT2
				RUN: llvm-profdata merge --sample --text --output-size-limit=187 %p/Inputs/sample-profile.proftext \| FileCheck %s --check-prefix=TEST_TEXT2
				TEST_TEXT2: main:184019:0
				TEST_TEXT2-NEXT: 4: 534
				TEST_TEXT2-NEXT: 4.2: 534
				TEST_TEXT2-NEXT: 5: 1075
				TEST_TEXT2-NEXT: 5.1: 1075
				TEST_TEXT2-NEXT: 6: 2080
				TEST_TEXT2-NEXT: 7: 534
				TEST_TEXT2-NEXT: 9: 2064 _Z3bari:1471 _Z3fooi:631
				TEST_TEXT2-NEXT: 10: inline1:1000
				TEST_TEXT2-NEXT: 1: 1000
				TEST_TEXT2-NEXT: 10: inline2:2000
				TEST_TEXT2-NEXT: 1: 2000
				TEST_TEXT2-NEXT: _Z3bari:20301:1437
				TEST_TEXT2-NEXT: 1: 1437

				3- 2 functions dropped
				RUN: llvm-profdata merge --sample --text --output-size-limit=170 %p/Inputs/sample-profile.proftext \| FileCheck %s --check-prefix=TEST_TEXT3
				TEST_TEXT3: main:184019:0
				TEST_TEXT3-NEXT: 4: 534
				TEST_TEXT3-NEXT: 4.2: 534
				TEST_TEXT3-NEXT: 5: 1075
				TEST_TEXT3-NEXT: 5.1: 1075
				TEST_TEXT3-NEXT: 6: 2080
				TEST_TEXT3-NEXT: 7: 534
				TEST_TEXT3-NEXT: 9: 2064 _Z3bari:1471 _Z3fooi:631
				TEST_TEXT3-NEXT: 10: inline1:1000
				TEST_TEXT3-NEXT: 1: 1000
				TEST_TEXT3-NEXT: 10: inline2:2000
				TEST_TEXT3-NEXT: 1: 2000

				4- All functions dropped, should report an error
				RUN: not llvm-profdata merge --sample --text --output-size-limit=158 %p/Inputs/sample-profile.proftext 2>&1 \| FileCheck %s --check-prefix=INVALID1
				INVALID1: error: Too much profile data

				5- ExtBinary form, no function dropped. Check output size and file content converted back to text
				RUN: llvm-profdata merge --sample --extbinary --output-size-limit=489 %p/Inputs/sample-profile.proftext -o %t.output
				RUN: test $(stat -c %%s %t.output) -le 489
				dyung (Not Done): On Windows this seems to be expanded in a way you probably did not expect (https://lab.llvm.org/buildbot/#/builders/216/builds/15553/steps/7/logs/FAIL__LLVM__output-size-limit_test): `"$(stat" "-c" "%s" "Z:\test\build\test\tools\llvm-profdata\Output\output-size-limit.test.tmp.output)"`
				RUN: llvm-profdata merge --sample --text %t.output \| FileCheck %s --check-prefix=TEST_EXTBINARY1
				TEST_EXTBINARY1: main:184019:0
				TEST_EXTBINARY1-NEXT: 4: 534
				TEST_EXTBINARY1-NEXT: 4.2: 534
				TEST_EXTBINARY1-NEXT: 5: 1075
				TEST_EXTBINARY1-NEXT: 5.1: 1075
				TEST_EXTBINARY1-NEXT: 6: 2080
				TEST_EXTBINARY1-NEXT: 7: 534
				TEST_EXTBINARY1-NEXT: 9: 2064 _Z3bari:1471 _Z3fooi:631
				TEST_EXTBINARY1-NEXT: 10: inline1:1000
				TEST_EXTBINARY1-NEXT: 1: 1000
				TEST_EXTBINARY1-NEXT: 10: inline2:2000
				TEST_EXTBINARY1-NEXT: 1: 2000
				TEST_EXTBINARY1-NEXT: _Z3bari:20301:1437
				TEST_EXTBINARY1-NEXT: 1: 1437
				TEST_EXTBINARY1-NEXT: _Z3fooi:7711:610
				TEST_EXTBINARY1-NEXT: 1: 610

				6- ExtBinary form, 1 function dropped
				RUN: llvm-profdata merge --sample --extbinary --output-size-limit=488 %p/Inputs/sample-profile.proftext -o %t.output
				RUN: test $(stat -c %%s %t.output) -le 488
				RUN: llvm-profdata merge --sample --text %t.output \| FileCheck %s --check-prefix=TEST_EXTBINARY2
				TEST_EXTBINARY2: main:184019:0
				TEST_EXTBINARY2-NEXT: 4: 534
				TEST_EXTBINARY2-NEXT: 4.2: 534
				TEST_EXTBINARY2-NEXT: 5: 1075
				TEST_EXTBINARY2-NEXT: 5.1: 1075
				TEST_EXTBINARY2-NEXT: 6: 2080
				TEST_EXTBINARY2-NEXT: 7: 534
				TEST_EXTBINARY2-NEXT: 9: 2064 _Z3bari:1471 _Z3fooi:631
				TEST_EXTBINARY2-NEXT: 10: inline1:1000
				TEST_EXTBINARY2-NEXT: 1: 1000
				TEST_EXTBINARY2-NEXT: 10: inline2:2000
				TEST_EXTBINARY2-NEXT: 1: 2000
				TEST_EXTBINARY2-NEXT: _Z3bari:20301:1437
				TEST_EXTBINARY2-NEXT: 1: 1437

				7- ExtBinary form, 2 functions dropped
				RUN: llvm-profdata merge --sample --extbinary --output-size-limit=474 %p/Inputs/sample-profile.proftext -o %t.output
				RUN: test $(stat -c %%s %t.output) -le 474
				RUN: llvm-profdata merge --sample --text %t.output \| FileCheck %s --check-prefix=TEST_EXTBINARY3
				TEST_EXTBINARY3: main:184019:0
				TEST_EXTBINARY3-NEXT: 4: 534
				TEST_EXTBINARY3-NEXT: 4.2: 534
				TEST_EXTBINARY3-NEXT: 5: 1075
				TEST_EXTBINARY3-NEXT: 5.1: 1075
				TEST_EXTBINARY3-NEXT: 6: 2080
				TEST_EXTBINARY3-NEXT: 7: 534
				TEST_EXTBINARY3-NEXT: 9: 2064 _Z3bari:1471 _Z3fooi:631
				TEST_EXTBINARY3-NEXT: 10: inline1:1000
				TEST_EXTBINARY3-NEXT: 1: 1000
				TEST_EXTBINARY3-NEXT: 10: inline2:2000
				TEST_EXTBINARY3-NEXT: 1: 2000

				8- ExtBinary form, all functions dropped
				RUN: not llvm-profdata merge --sample --extbinary --output-size-limit=400 %p/Inputs/sample-profile.proftext 2>&1 \| FileCheck %s --check-prefix=INVALID2
				INVALID2: error: Too much profile data

llvm/tools/llvm-profdata/llvm-profdata.cpp

static void		static void
mergeSampleProfile(const WeightedFileVector &Inputs, SymbolRemapper *Remapper,		mergeSampleProfile(const WeightedFileVector &Inputs, SymbolRemapper *Remapper,
StringRef OutputFilename, ProfileFormat OutputFormat,		StringRef OutputFilename, ProfileFormat OutputFormat,
StringRef ProfileSymbolListFile, bool CompressAllSections,		StringRef ProfileSymbolListFile, bool CompressAllSections,
bool UseMD5, bool GenPartialProfile, bool GenCSNestedProfile,		bool UseMD5, bool GenPartialProfile, bool GenCSNestedProfile,
bool SampleMergeColdContext, bool SampleTrimColdContext,		bool SampleMergeColdContext, bool SampleTrimColdContext,
bool SampleColdContextFrameDepth, FailureMode FailMode,		bool SampleColdContextFrameDepth, FailureMode FailMode,
bool DropProfileSymbolList) {		bool DropProfileSymbolList, size_t OutputSizeLimit) {
using namespace sampleprof;		using namespace sampleprof;
SampleProfileMap ProfileMap;		SampleProfileMap ProfileMap;
SmallVector<std::unique_ptr<sampleprof::SampleProfileReader>, 5> Readers;		SmallVector<std::unique_ptr<sampleprof::SampleProfileReader>, 5> Readers;
LLVMContext Context;		LLVMContext Context;
sampleprof::ProfileSymbolList WriterList;		sampleprof::ProfileSymbolList WriterList;
std::optional<bool> ProfileIsProbeBased;		std::optional<bool> ProfileIsProbeBased;
std::optional<bool> ProfileIsCS;		std::optional<bool> ProfileIsCS;
for (const auto &Input : Inputs) {		for (const auto &Input : Inputs) {
auto ReaderOrErr = SampleProfileReader::create(Input.Filename, Context,		auto ReaderOrErr = SampleProfileReader::create(Input.Filename, Context,
FSDiscriminatorPassOption);		FSDiscriminatorPassOption);
if (std::error_code EC = ReaderOrErr.getError()) {		if (std::error_code EC = ReaderOrErr.getError()) {
warnOrExitGivenError(FailMode, EC, Input.Filename);		warnOrExitGivenError(FailMode, EC, Input.Filename);
continue;		continue;
}		}

// We need to keep the readers around until after all the files are		// We need to keep the readers around until after all the files are
// read so that we do not lose the function names stored in each		// read so that we do not lose the function names stored in each
// reader's memory. The function names are needed to write out the		// reader's memory. The function names are needed to write out the
		snehasish (Done): How about moving this to SampleProfileWriter so that we can use this API in other tooling where we don't invoke llvm-profdata?
		snehasish (Done): Re-opening, since I think we could move the RewriteProfileSizeLimit and CalculateNumFunctionsToRemove methods to SampleProfileWriter as well, so that this heuristic (and subsequent updates to it) can be reused in internal tooling directly.
		huangjd (Author, Done): CalculateNumFunctionToRemove is moved inside FunctionPruningStrategy since it won't be used anywhere else. It can be overridden if necessary.
// merged profile map.		// merged profile map.
Readers.push_back(std::move(ReaderOrErr.get()));		Readers.push_back(std::move(ReaderOrErr.get()));
const auto Reader = Readers.back().get();		const auto Reader = Readers.back().get();
if (std::error_code EC = Reader->read()) {		if (std::error_code EC = Reader->read()) {
warnOrExitGivenError(FailMode, EC, Input.Filename);		warnOrExitGivenError(FailMode, EC, Input.Filename);
Readers.pop_back();		Readers.pop_back();
continue;		continue;
}		}
Show All 30 Lines	if (!DropProfileSymbolList) {
WriterList.merge(*ReaderList);		WriterList.merge(*ReaderList);
}		}
}		}

if (ProfileIsCS && (SampleMergeColdContext \|\| SampleTrimColdContext)) {		if (ProfileIsCS && (SampleMergeColdContext \|\| SampleTrimColdContext)) {
// Use threshold calculated from profile summary unless specified.		// Use threshold calculated from profile summary unless specified.
SampleProfileSummaryBuilder Builder(ProfileSummaryBuilder::DefaultCutoffs);		SampleProfileSummaryBuilder Builder(ProfileSummaryBuilder::DefaultCutoffs);
auto Summary = Builder.computeSummaryForProfiles(ProfileMap);		auto Summary = Builder.computeSummaryForProfiles(ProfileMap);
uint64_t SampleProfColdThreshold =		uint64_t SampleProfColdThreshold =
		snehasish (Done): Perhaps wrap this in DEBUG(); I'm not sure whether it's useful to have this message all the time.
ProfileSummaryBuilder::getColdCountThreshold(		ProfileSummaryBuilder::getColdCountThreshold(
(Summary->getDetailedSummary()));		(Summary->getDetailedSummary()));

// Trim and merge cold context profile using cold threshold above;		// Trim and merge cold context profile using cold threshold above;
SampleContextTrimmer(ProfileMap)		SampleContextTrimmer(ProfileMap)
.trimAndMergeColdContextProfiles(		.trimAndMergeColdContextProfiles(
SampleProfColdThreshold, SampleTrimColdContext,		SampleProfColdThreshold, SampleTrimColdContext,
SampleMergeColdContext, SampleColdContextFrameDepth, false);		SampleMergeColdContext, SampleColdContextFrameDepth, false);
}		}

if (ProfileIsCS && GenCSNestedProfile) {		if (ProfileIsCS && GenCSNestedProfile) {
CSProfileConverter CSConverter(ProfileMap);		CSProfileConverter CSConverter(ProfileMap);
CSConverter.convertProfiles();		CSConverter.convertProfiles();
ProfileIsCS = FunctionSamples::ProfileIsCS = false;		ProfileIsCS = FunctionSamples::ProfileIsCS = false;
}		}

		// If limiting the output size, write to a string buffer first, and drop
		// functions if the output size exceeds limit. This iterates multiple times
		// until the limit is satisfied.
		SmallVector<char> StringBuffer;
		std::unique_ptr<raw_ostream> BufferStream(
		new raw_svector_ostream(StringBuffer));

auto WriterOrErr =		auto WriterOrErr =
SampleProfileWriter::create(OutputFilename, FormatMap[OutputFormat]);		SampleProfileWriter::create(OutputFilename, FormatMap[OutputFormat]);
if (std::error_code EC = WriterOrErr.getError())		if (std::error_code EC = WriterOrErr.getError())
exitWithErrorCode(EC, OutputFilename);		exitWithErrorCode(EC, OutputFilename);

auto Writer = std::move(WriterOrErr.get());		auto Writer = std::move(WriterOrErr.get());
// WriterList will have StringRef refering to string in Buffer.		// WriterList will have StringRef refering to string in Buffer.
// Make sure Buffer lives as long as WriterList.		// Make sure Buffer lives as long as WriterList.
auto Buffer = getInputFileBuf(ProfileSymbolListFile);		auto Buffer = getInputFileBuf(ProfileSymbolListFile);
handleExtBinaryWriter(*Writer, OutputFormat, Buffer.get(), WriterList,		handleExtBinaryWriter(*Writer, OutputFormat, Buffer.get(), WriterList,
CompressAllSections, UseMD5, GenPartialProfile);		CompressAllSections, UseMD5, GenPartialProfile);
if (std::error_code EC = Writer->write(ProfileMap))
		if (std::error_code EC =
		Writer->writeWithSizeLimit(ProfileMap, OutputSizeLimit))
exitWithErrorCode(std::move(EC));		exitWithErrorCode(std::move(EC));
}		}
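
The buffer-then-trim approach described in the comment above (write to a string buffer, drop functions if the output exceeds the limit, repeat) can be sketched roughly as follows. This is not the patch's implementation: `ToyFunction`, `serialize`, and `writeWithLimit` are hypothetical, the one-line-per-function format is made up, a limit of 0 is assumed to mean "no limit", and the sketch drops exactly one function per iteration, which is the simplest policy and not necessarily the heuristic the patch uses. The size check is "less than or equal", matching the `-le` comparisons in output-size-limit.test.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, heavily simplified function profile.
struct ToyFunction {
  std::string Name;
  uint64_t TotalSamples;
};

// Made-up text format: one "name:total" line per function.
std::string serialize(const std::vector<ToyFunction> &Profile) {
  std::string Out;
  for (const auto &F : Profile)
    Out += F.Name + ":" + std::to_string(F.TotalSamples) + "\n";
  return Out;
}

// Returns a serialized profile of at most Limit bytes, or std::nullopt if
// every function would have to be dropped. Limit == 0 is assumed to disable
// the check.
std::optional<std::string> writeWithLimit(std::vector<ToyFunction> Profile,
                                          size_t Limit) {
  // Sort hottest first so the coldest function is always at the back.
  std::stable_sort(Profile.begin(), Profile.end(),
                   [](const ToyFunction &A, const ToyFunction &B) {
                     return A.TotalSamples > B.TotalSamples;
                   });
  while (true) {
    // Write to an in-memory buffer first; nothing reaches the real output
    // until the size is known to fit.
    std::string Buffer = serialize(Profile);
    if (Limit == 0 || Buffer.size() <= Limit)
      return Buffer;
    assert(!Profile.empty() && "an over-limit buffer implies some function");
    Profile.pop_back(); // Drop the function with the lowest total samples.
    if (Profile.empty())
      return std::nullopt; // Corresponds to the error case in the test.
  }
}
```

With the three functions from the test input (`main`, `_Z3bari`, `_Z3fooi`), lowering the limit by one byte below the full size drops `_Z3fooi` first, then `_Z3bari`, and finally reports failure rather than emitting an empty profile.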

		davidxl (Done): Extract the following to a helper function.
static WeightedFile parseWeightedFile(const StringRef &WeightedFilename) {		static WeightedFile parseWeightedFile(const StringRef &WeightedFilename) {
StringRef WeightStr, FileName;		StringRef WeightStr, FileName;
std::tie(WeightStr, FileName) = WeightedFilename.split(',');		std::tie(WeightStr, FileName) = WeightedFilename.split(',');

uint64_t Weight;		uint64_t Weight;
if (WeightStr.getAsInteger(10, Weight) \|\| Weight < 1)		if (WeightStr.getAsInteger(10, Weight) \|\| Weight < 1)
exitWithError("input weight must be a positive integer");		exitWithError("input weight must be a positive integer");

return {std::string(FileName), Weight};		return {std::string(FileName), Weight};
		snehasish (Done): An if statement with initialization would make the intent clearer here: `if (EC = File.error(); EC) { }`
}		}

static void addWeightedInput(WeightedFileVector &WNI, const WeightedFile &WF) {		static void addWeightedInput(WeightedFileVector &WNI, const WeightedFile &WF) {
StringRef Filename = WF.Filename;		StringRef Filename = WF.Filename;
uint64_t Weight = WF.Weight;		uint64_t Weight = WF.Weight;
		wenlei (Not Done): This heuristic can be quite inaccurate. The number of body sample + call site sample entries can be a much better proxy for actual profile size, and it is still easily accessible: `functionSamples.getBodySamples().size() + functionSamples.getCallsiteSamples().size()`
		huangjd (Author, Done): The new revision performs the iterations on a string buffer, so the performance of the heuristic is not a big concern (and it should no longer overshoot by removing too many functions). However, a simple proportional heuristic is still too slow, because it ends up removing one function at the tail at a time.
		huangjd (Author, Done): Better heuristics can be added in a later patch. We need more real-world profile data (industrial use) to confirm which model is best.
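
As a rough illustration of wenlei's suggestion, the sketch below estimates how many of the coldest functions to drop by treating the number of body-sample plus call-site-sample entries as a proxy for serialized size. It is only a sketch under stated assumptions: `ToyFunctionSamples`, `entryCount`, and `numFunctionsToDrop` are hypothetical names, bytes are assumed to be proportional to entries, and none of this is claimed to match the `FunctionPruningStrategy` in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for FunctionSamples; only the entry counts matter.
struct ToyFunctionSamples {
  std::map<uint32_t, uint64_t> BodySamples;
  std::map<uint32_t, uint64_t> CallsiteSamples;
};

// Size proxy suggested in the review: body entries plus call site entries.
size_t entryCount(const ToyFunctionSamples &F) {
  return F.BodySamples.size() + F.CallsiteSamples.size();
}

// Given functions ordered hottest to coldest, estimate how many to remove
// from the cold end so the output shrinks from CurrentSize to at most Limit.
size_t numFunctionsToDrop(const std::vector<ToyFunctionSamples> &HotToCold,
                          size_t CurrentSize, size_t Limit) {
  if (CurrentSize <= Limit)
    return 0;
  assert(!HotToCold.empty() && "an over-limit profile should have functions");
  size_t TotalEntries = 0;
  for (const auto &F : HotToCold)
    TotalEntries += entryCount(F);
  // Entries to shed, rounded up, assuming bytes scale with entries.
  size_t Excess = CurrentSize - Limit;
  size_t Need = (TotalEntries * Excess + CurrentSize - 1) / CurrentSize;
  size_t Shed = 0, Dropped = 0;
  for (auto It = HotToCold.rbegin(); It != HotToCold.rend() && Shed < Need;
       ++It) {
    Shed += entryCount(*It);
    ++Dropped;
  }
  // Always make progress when over the limit, even if no entries exist.
  return Dropped == 0 ? 1 : Dropped;
}
```

An estimate like this would still need to be checked against the real serialized size, for example by rewriting into a buffer and repeating if the result remains over the limit.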

// If it's STDIN just pass it on.		// If it's STDIN just pass it on.
if (Filename == "-") {		if (Filename == "-") {
WNI.push_back({std::string(Filename), Weight});		WNI.push_back({std::string(Filename), Weight});
return;		return;
}		}

llvm::sys::fs::file_status Status;		llvm::sys::fs::file_status Status;
▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	static int merge_main(int argc, const char *argv[]) {
cl::opt<bool> SampleTrimColdContext(		cl::opt<bool> SampleTrimColdContext(
"sample-trim-cold-context", cl::init(false), cl::Hidden,		"sample-trim-cold-context", cl::init(false), cl::Hidden,
cl::desc(		cl::desc(
"Trim context sample profiles whose count is below cold threshold"));		"Trim context sample profiles whose count is below cold threshold"));
cl::opt<uint32_t> SampleColdContextFrameDepth(		cl::opt<uint32_t> SampleColdContextFrameDepth(
"sample-frame-depth-for-cold-context", cl::init(1),		"sample-frame-depth-for-cold-context", cl::init(1),
cl::desc("Keep the last K frames while merging cold profile. 1 means the "		cl::desc("Keep the last K frames while merging cold profile. 1 means the "
"context-less base profile"));		"context-less base profile"));
		cl::opt<size_t> OutputSizeLimit(
		"output-size-limit", cl::init(0), cl::Hidden,
		snehasish (Not Done): I think this is generally useful and we should make it visible.
		cl::desc("Trim cold functions until profile size is below specified "
		"limit in bytes. This uses a heursitic and functions may be "
		snehasish (Done): nit: Just heuristic instead of "heuristic algorithm"?
		"excessively trimmed"));
cl::opt<bool> GenPartialProfile(		cl::opt<bool> GenPartialProfile(
"gen-partial-profile", cl::init(false), cl::Hidden,		"gen-partial-profile", cl::init(false), cl::Hidden,
cl::desc("Generate a partial profile (only meaningful for -extbinary)"));		cl::desc("Generate a partial profile (only meaningful for -extbinary)"));
cl::opt<std::string> SupplInstrWithSample(		cl::opt<std::string> SupplInstrWithSample(
"supplement-instr-with-sample", cl::init(""), cl::Hidden,		"supplement-instr-with-sample", cl::init(""), cl::Hidden,
cl::desc("Supplement an instr profile with sample profile, to correct "		cl::desc("Supplement an instr profile with sample profile, to correct "
"the profile unrepresentativeness issue. The sample "		"the profile unrepresentativeness issue. The sample "
"profile is the input of the flag. Output will be in instr "		"profile is the input of the flag. Output will be in instr "
▲ Show 20 Lines • Show All 70 Lines • ▼ Show 20 Lines	if (ProfileKind == instr)
mergeInstrProfile(WeightedInputs, DebugInfoFilename, Remapper.get(),		mergeInstrProfile(WeightedInputs, DebugInfoFilename, Remapper.get(),
OutputFilename, OutputFormat, OutputSparse, NumThreads,		OutputFilename, OutputFormat, OutputSparse, NumThreads,
FailureMode, ProfiledBinary);		FailureMode, ProfiledBinary);
else		else
mergeSampleProfile(		mergeSampleProfile(
WeightedInputs, Remapper.get(), OutputFilename, OutputFormat,		WeightedInputs, Remapper.get(), OutputFilename, OutputFormat,
ProfileSymbolListFile, CompressAllSections, UseMD5, GenPartialProfile,		ProfileSymbolListFile, CompressAllSections, UseMD5, GenPartialProfile,
GenCSNestedProfile, SampleMergeColdContext, SampleTrimColdContext,		GenCSNestedProfile, SampleMergeColdContext, SampleTrimColdContext,
SampleColdContextFrameDepth, FailureMode, DropProfileSymbolList);		SampleColdContextFrameDepth, FailureMode, DropProfileSymbolList,
		OutputSizeLimit);
return 0;		return 0;
}		}

/// Computer the overlap b/w profile BaseFilename and profile TestFilename.		/// Computer the overlap b/w profile BaseFilename and profile TestFilename.
static void overlapInstrProfile(const std::string &BaseFilename,		static void overlapInstrProfile(const std::string &BaseFilename,
const std::string &TestFilename,		const std::string &TestFilename,
const OverlapFuncFilters &FuncFilter,		const OverlapFuncFilters &FuncFilter,
raw_fd_ostream &OS, bool IsCS) {		raw_fd_ostream &OS, bool IsCS) {