This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

llvm/include/llvm/ProfileData/SampleProf.h
llvm/include/llvm/ProfileData/SampleProfReader.h
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
llvm/lib/ProfileData/SampleProf.cpp
llvm/lib/ProfileData/SampleProfReader.cpp
llvm/lib/ProfileData/SampleProfWriter.cpp
llvm/lib/Transforms/IPO/SampleContextTracker.cpp
llvm/test/tools/llvm-profdata/Inputs/sample-nametable-after-samples.profdata
llvm/test/tools/llvm-profdata/sample-nametable.test
llvm/tools/llvm-profdata/llvm-profdata.cpp
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/unittests/tools/llvm-profdata/CMakeLists.txt
llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp
llvm/unittests/tools/llvm-profdata/OutputSizeLimitTest.cpp

Differential D147740

[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map
ClosedPublic

Authored by huangjd on Apr 6 2023, 1:53 PM.

Details

Reviewers

davidxl
xur
kazu
ellis
aeubanks
snehasish
wlei
hoy
wenlei
asbirlea

Commits

rG7624de5beae2: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed…
rG66ba71d913df: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed…
rG12e9c7aaa66b: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed…
rG31af18bccea9: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed…

Summary

This is phase 1 of multiple planned improvements to the sample profile loader. The major change is to use the MD5 hash code (instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduces the time it takes to construct the map.

The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow.

Several changes to note:

(1) For a non-CS SampleContext, if it is already an MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 functions (without converting them to strings) and regular function names.

(2) The SampleProfileMap is a wrapper around *map<uint64_t, FunctionSamples> that also provides an interface accepting SampleContext as the key, so that existing code still works. It checks for MD5 collisions (unlikely but not too unlikely, since we only take the lower 64 bits) and handles them to at least guarantee compilation correctness (the conflicting old profile is dropped, instead of returning an old profile with an inconsistent context). Other code should not try to use MD5 as the key to access the map directly, because it will not be able to handle MD5 collisions at all (see the exception in (5)).

(3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing.

(4) Previously we ensured the invariant that in SampleProfileMap the key is equal to the Context of the value, for profile maps that are eventually used for output (as in llvm-profdata/llvm-profgen). Since the key is now the MD5 hash, only the value keeps the context. In several places where an intermediate SampleProfileMap is created, each new FunctionSamples' context is set immediately after insertion, which is necessary to "remember" the context that would otherwise be irretrievable.

(5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (once to index into FuncOffsetTable and once into SampleProfileMap, more if there are additional sections). In this case the SampleProfileMap is accessed directly with the MD5 value so that we don't recalculate it each time (which is expensive).
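The caching described in (5) can be sketched roughly as follows. This is only an illustration under stated assumptions: `hashName`, `Tables` and `buildTables` are hypothetical names, FNV-1a stands in for the low 64 bits of MD5, and `std::unordered_map` stands in for the real containers. The point is just that each name is hashed once and the cached key is reused for every table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

using HashKey = std::uint64_t;

// Placeholder hash (FNV-1a), NOT the MD5 used by the patch. CallCount makes
// the number of hash computations observable.
inline HashKey hashName(const std::string &Name, int &CallCount) {
  ++CallCount;
  HashKey H = 14695981039346656037ULL;
  for (unsigned char C : Name) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  return H;
}

// Hypothetical stand-ins for FuncOffsetTable and the profile map.
struct Tables {
  std::unordered_map<HashKey, std::uint64_t> FuncOffsets; // key -> file offset
  std::unordered_map<HashKey, std::uint64_t> Profiles;    // key -> sample count
  int HashCalls = 0;
};

// Hash every name exactly once, then reuse the cached keys for both tables.
inline Tables buildTables(const std::vector<std::string> &NameTable) {
  Tables T;
  std::vector<HashKey> Cached;
  Cached.reserve(NameTable.size());
  for (const std::string &Name : NameTable)
    Cached.push_back(hashName(Name, T.HashCalls));
  for (std::size_t I = 0; I < Cached.size(); ++I) {
    T.FuncOffsets[Cached[I]] = 100 * I; // made-up offsets
    T.Profiles[Cached[I]] = 10 * I;     // made-up sample counts
  }
  return T;
}
```

With three names, `buildTables` calls the hash three times even though two tables are populated; without the cache it would be six.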

Performance impact:
When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

There are a very large number of changes, so older changes are hidden.

davidxl added inline comments.May 31 2023, 9:06 AM

llvm/include/llvm/ProfileData/SampleProf.h
1331	Since this forces insert in case of conflict, should it be called just 'emplace'?
1340	potential memory leak? Also why inserting an empty FunctionSamples instead of the one passed in?
1354	what is the purpose of this wrapper?
1358	This interface is confusing. User expect it to return existing entry, but here it erases it. Should this interface be hidden (and not allowed with assert)?
1366	is this always true as currently implemented?

huangjd added inline comments.Jun 1 2023, 1:56 PM

llvm/include/llvm/ProfileData/SampleProf.h
1331	try_emplace functionally is same as emplace (https://en.cppreference.com/w/cpp/container/unordered_map/try_emplace), only difference is that try_emplace does not move the arguments to construct a mapped_type if key exists.
1354	existing code compatibility
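The try_emplace property mentioned above can be checked with a small standalone example. It uses std::unordered_map rather than the map in the patch, and `tryEmplaceKeepsArgument` is a hypothetical helper name; it only shows that an rvalue argument is left untouched when the key already exists.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Returns true if try_emplace on an existing key neither inserted nor
// moved from its rvalue argument.
inline bool tryEmplaceKeepsArgument() {
  const std::string Original = "a string long enough to live on the heap";
  std::unordered_map<int, std::string> M;
  M.try_emplace(1, "first");

  std::string Value = Original;
  auto [It, Inserted] = M.try_emplace(1, std::move(Value));

  // Key 1 already exists: the mapped value is unchanged and Value was not
  // moved from, which the standard guarantees for try_emplace.
  return !Inserted && It->second == "first" && Value == Original;
}
```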

Explaining the behavior of MD5-key FunctionSampleMap

Previously the map is {SampleContext : FunctionSamples}, where FunctionSamples holds a copy of the SampleContext (but not always enforced, for example when the map is an intermediate product of a merge, or during IPO passes). Currently the map is {Hash(SampleContext) : FunctionSamples}, where FunctionSamples holds the SampleContext immediately after insertion (otherwise the context is lost forever)

When inserting to the map, the caller must provide a SampleContext, so that when an existing entry with the same hash value is found, its SampleContext is compared with the caller's new SampleContext.

If they are the same, then it's actually a match, so the existing FunctionSamples is returned and no insertion happens.

If they are different, then there's an MD5 collision, and in this case we have to decide which FunctionSamples to keep. The decision is to use the new FunctionSamples and erase the old one, because IPO calls SampleProfileReader::getOrCreateSamplesFor(), and we want to return a sample with the requested function name, rather than a sample with a *different* function name/SampleContext. Since a new entry is inserted with a new FunctionSamples provided by the caller, return.second should be true to indicate that a new value is inserted, so that the caller can perform other necessary logic as if a new entry is inserted.

There's not much we can do to keep both entries in case of a collision, as it is very rare (and probably only happens theoretically), and using a multi-map slows down too much. Note that the profile should not affect compilation correctness, so even when collision happens and profiles are dropped, it should only affect the optimization applied to related functions
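The insertion rules above can be sketched as a small wrapper. This is a simplified illustration, not the code in the patch: `Samples`, `HashKeyedProfiles` and `getOrCreate` are hypothetical names, a std::string stands in for SampleContext, and the caller supplies the hash directly so that a collision is easy to provoke.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

struct Samples {       // stand-in for FunctionSamples
  std::string Context; // stand-in for SampleContext
  std::uint64_t Total = 0;
};

class HashKeyedProfiles {
  std::unordered_map<std::uint64_t, Samples> Map;

public:
  // Returns {entry, inserted}.
  //  - hash not present:               insert, remember the context -> true
  //  - hash present, same context:     genuine match                -> false
  //  - hash present, other context:    collision; the old entry is dropped
  //    and replaced by a fresh one for the requested context        -> true
  std::pair<Samples *, bool> getOrCreate(std::uint64_t Hash,
                                         const std::string &Ctx) {
    auto [It, Inserted] = Map.try_emplace(Hash);
    if (!Inserted && It->second.Context == Ctx)
      return {&It->second, false};
    It->second = Samples{}; // no-op for a new entry, reset on collision
    It->second.Context = Ctx;
    return {&It->second, true};
  }

  std::size_t size() const { return Map.size(); }
};
```

A collision therefore never yields a profile whose context differs from the one requested; the cost is that the earlier profile is lost.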

Fix bug in emplace on MD5 collision

Added unit test

huangjd marked 3 inline comments as done.Jun 1 2023, 2:39 PM

huangjd added inline comments.

llvm/include/llvm/ProfileData/SampleProf.h
1358	In C++, [] is same as try_emplace with default constructed mapped_type, so I am keeping the behavior consistent. Keeping this function because many places use it.
1366	This returns false if the existing entry actually has the same context, which indicates a match rather than an MD5 collision, so there is no need to set the context again (and setting the context actually erases the flags in the context, which are not used for equality comparison or hashing)

Harbormaster completed remote builds in B235992: Diff 527624.Jun 1 2023, 4:39 PM

huangjd edited the summary of this revision.Jun 2 2023, 7:35 PM

huangjd marked 2 inline comments as done.Jun 6 2023, 11:42 AM

huangjd added inline comments.

llvm/include/llvm/ProfileData/SampleProf.h
1340	Fixed. FunctionSamples is copy assignable

Use llvm::DenseMap for profiles since now the key is uint64

Harbormaster completed remote builds in B237123: Diff 529099.Jun 6 2023, 10:13 PM

huangjd added a reviewer: asbirlea.Jun 8 2023, 1:25 PM

huangjd added a child revision: D152320: [llvm-profdata] Use StringRef in place of string in FunctionSamplesMap.Jun 8 2023, 1:28 PM

@davidxl @wlei @wenlei @aeubanks
Could you please review the revised patch? Note that it is very different from the original one

Change test case to avoid reserved key value in Dense Map (~0ULL)

huangjd added inline comments.Jun 8 2023, 3:57 PM

llvm/test/tools/llvm-profdata/sample-nametable.test
11	Note: 0xFFFFFFFFFFFFFFFF and 0xFFFFFFFFFFFFFFFE are reserved in llvm::DenseMap and cannot be used as keys. Changing it to 0xFFFFFFFFFFFFFFFD. I am not adding a check for them in SampleProfileMap, because the assumption that a hash value is never equal to them is made in so many places throughout LLVM; any check should be done inside DenseMap if actually needed.
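For readers unfamiliar with the restriction, the two values named in the comment act as sentinels inside the map's bucket array, so a real key equal to either of them cannot be stored. A minimal sketch of such a check, using only the two values quoted above; `isReservedKey` and the constant names are hypothetical and not part of LLVM:

```cpp
#include <cassert>
#include <cstdint>

// Sentinel values taken from the review comment above.
constexpr std::uint64_t ReservedEmptyKey = 0xFFFFFFFFFFFFFFFFULL;
constexpr std::uint64_t ReservedTombstoneKey = 0xFFFFFFFFFFFFFFFEULL;

// True if K would clash with one of the sentinels.
constexpr bool isReservedKey(std::uint64_t K) {
  return K == ReservedEmptyKey || K == ReservedTombstoneKey;
}
```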

huangjd added a child revision: D152490: [llvm-profdata] Use StringRef for CallTargetMap.Jun 8 2023, 5:57 PM

Harbormaster completed remote builds in B237611: Diff 529756.Jun 8 2023, 5:58 PM

davidxl added inline comments.Jun 9 2023, 7:45 AM

llvm/lib/ProfileData/SampleProf.cpp
403	is this a dead function? should it be removed in a separate patch?
llvm/lib/ProfileData/SampleProfReader.cpp
591	add comment explaining the benefit of lazy hash computing.

snehasish added inline comments.Jun 9 2023, 4:06 PM

llvm/include/llvm/ProfileData/SampleProf.h
324	Is there a chance that the base 10 encoding may change? Is this the only place where we generate the hashes?
1327–1328	Can this lead to non-deterministic builds?
1390	typo "function"
llvm/lib/ProfileData/SampleProfReader.cpp
526–527	Can we use a more descriptive name for the output parameters (here and elsewhere)?
llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp
85	Can we use the text format (with some additional helper functions) here instead of the binary data? It would be hard to update in case of changes in the future.
117	This should be assert since if it doesn't hold the following lines which deref NameTable[0] and NameTable[1] will segfault.
141	Capture with structured binding here with clearer variable names to make it easier to read?

yurybura added a subscriber: yurybura.Jun 12 2023, 4:55 AM

huangjd marked an inline comment as done.Jun 13 2023, 11:57 AM

huangjd added inline comments.

llvm/include/llvm/ProfileData/SampleProf.h
324	No, and doesn't matter. Base-10 encoding is used only because the existing implementation wanted to make it compatible to represent both strings (function names) and integers (MD5), and the solution was to convert MD5 into a base-10 string. This seems inefficient and I am working on a subsequent patch to deal with it.
1327–1328	Why? There's no non-determinism here, the existing entry always gets erased first. SampleProfWriter does sort the profile before writing so it's ok
llvm/lib/ProfileData/SampleProf.cpp
403	After refactoring, the invariant key == value.getContext() is moot, so this function is dead
llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp
85	See comments ... A unit test is required because the function /// names are not printable ASCII characters.
85	This test case is very theoretical, as I can't find two printable ASCII strings with colliding MD5 (there exists for sure, but none is known yet) In this case I have to use a unit test because I cannot validate the output using llvm-lit which requires printable characters.

Clarify some functions & comments
Rewrite unit test

huangjd marked 6 inline comments as done.Jun 13 2023, 4:18 PM

Harbormaster completed remote builds in B238621: Diff 531076.Jun 13 2023, 6:11 PM

huangjd removed a child revision: D152490: [llvm-profdata] Use StringRef for CallTargetMap.Jun 13 2023, 6:24 PM

Use hash_code for SampleContext.getHashValue() so that DenseMap does not hash the MD5 again inside the map (uint64_t get hashed again which is unnecessary here because MD5 is sufficiently sparse)
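The idea of not hashing an already well-distributed key a second time can be illustrated with a pass-through hasher. The sketch below uses std::unordered_map instead of DenseMap, and `PassThroughHash` and `ProfileMap` are hypothetical names; it shows the general technique, not the mechanism the patch uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// The key is already (part of) an MD5 digest, so it is used as the hash
// value directly instead of being mixed again.
struct PassThroughHash {
  std::size_t operator()(std::uint64_t Key) const noexcept {
    return static_cast<std::size_t>(Key);
  }
};

using ProfileMap = std::unordered_map<std::uint64_t, int, PassThroughHash>;
```

This is only reasonable when the keys are known to be well distributed; for clustered keys a pass-through hash would cause heavy bucket collisions.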

Harbormaster completed remote builds in B238963: Diff 531529.Jun 14 2023, 5:14 PM

Extracted HashKeyMap wrapper class instead of binding it to SampleProfileMap

Harbormaster completed remote builds in B239279: Diff 531964.Jun 15 2023, 7:20 PM

lgtm . Please run more extensive testing including testing with sanitizer on. Please also wait a little for further comments from other reviewers.

This revision is now accepted and ready to land.Jun 20 2023, 1:21 PM

lgtm

llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
207	Document the parameter name /ProfileIsCS=/?
llvm/lib/ProfileData/SampleProf.cpp
205	nit: drop the make_pair in favour of an initializer list?
llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp
85	test is required because the function names below (i.e. String1 and String2) are not printable ASCII characters. I see, I got confused because I thought "String1" is used in a literal sense. Perhaps enhance the comments above to note this?

huangjd added a child revision: D153486: [llvm-profdata] GUIDToFuncNameMap can be static.Jun 21 2023, 5:45 PM

Actually do we really care about MD5 collision? ExtBinary format already ignored MD5 collision for regular string names (and therefore regular function profiles), as only one of two functions with colliding MD5 get written to the name table (and the other is therefore lost). If we are using CS profiles, since different CS profiles have different serializations, their hashes are distributed as expected. The most important thing is that, even if we detect a hash collision, we can't do anything about it except logging it (using a multi-map makes the reader much slower), so I think the MD5 collision check should be marked as LLVM_DEBUG. This does reduce 0.5 second out of ~30 seconds (1.67%) over the 1 GB profile read .

In D147740#4443233, @huangjd wrote:

Actually do we really care about MD5 collision? ExtBinary format already ignored MD5 collision for regular string names (and therefore regular function profiles), as only one of two functions with colliding MD5 get written to the name table (and the other is therefore lost). If we are using CS profiles, since different CS profiles have different serializations, their hashes are distributed as expected. The most important thing is that, even if we detect a hash collision, we can't do anything about it except logging it (using a multi-map makes the reader much slower), so I think the MD5 collision check should be marked as LLVM_DEBUG. This does reduce 0.5 second out of ~30 seconds (1.67%) over the 1 GB profile read .

I don't think we care. Is the new type HashKeyMap and SampleProfileMap all for detecting and reporting collision? I'd avoid all that complexity and prefer a simple DenseMap + a SampleContext->hash_code converter and not even bother with debug prints for collision...

Fix a few comments

This revision was landed with ongoing or failed builds.Jun 23 2023, 2:50 PM

Closed by commit rG31af18bccea9: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed… (authored by huangjd).

This revision was automatically updated to reflect the committed changes.

huangjd added a commit: rG31af18bccea9: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed….

Harbormaster completed remote builds in B240853: Diff 534078.Jun 23 2023, 3:23 PM

@huangjd your change seems to be causing build failures on Windows. Can you take a look and revert if you need time to investigate?

https://lab.llvm.org/buildbot/#/builders/216/builds/22833
https://lab.llvm.org/buildbot/#/builders/123/builds/19602
https://lab.llvm.org/buildbot/#/builders/172/builds/28315
https://lab.llvm.org/buildbot/#/builders/119/builds/13870
https://lab.llvm.org/buildbot/#/builders/233/builds/794
https://lab.llvm.org/buildbot/#/builders/235/builds/387
https://lab.llvm.org/buildbot/#/builders/13/builds/36921
https://lab.llvm.org/buildbot/#/builders/127/builds/50510

The error messages (VC)
https://lab.llvm.org/buildbot/#/builders/119/builds/13870/steps/7/logs/stdio

63.990 [3705/51/830] Building CXX object lib\CodeGen\CMakeFiles\LLVMCodeGen.dir\MIRSampleProfile.cpp.obj
FAILED: lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/MIRSampleProfile.cpp.obj 
C:\PROGRA~2\MICROS~1\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe  /nologo /TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GLIBCXX_ASSERTIONS -D_HAS_EXCEPTIONS=0 -D_LIBCPP_ENABLE_ASSERTIONS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\buildbot\as-builder-2\x-aarch64\build\lib\CodeGen -IC:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\lib\CodeGen -IC:\buildbot\as-builder-2\x-aarch64\build\include -IC:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include -external:IC:\buildbot\.zlib-win32\include -external:W0 -D__OPTIMIZE__ /Zc:inline /Zc:preprocessor /Zc:__cplusplus /Oi /bigobj /permissive- /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd5105 -wd4324 -w14062 -we4238 /Gw /MT /O2 /Ob2  /EHs-c- /GR- -UNDEBUG -std:c++17 /showIncludes /Folib\CodeGen\CMakeFiles\LLVMCodeGen.dir\MIRSampleProfile.cpp.obj /Fdlib\CodeGen\CMakeFiles\LLVMCodeGen.dir\LLVMCodeGen.pdb /FS -c C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\lib\CodeGen\MIRSampleProfile.cpp
C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include\llvm/ProfileData/SampleProf.h(1423): error C3200: 'llvm::DenseMap<llvm::hash_code,ValueT,llvm::DenseMapInfo<llvm::hash_code,void>,llvm::detail::DenseMapPair<KeyT,ValueT>>': invalid template argument for template parameter 'MapT', expected a class template
        with
        [
            ValueT=llvm::sampleprof::FunctionSamples,
            KeyT=llvm::hash_code
        ]
C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include\llvm/ProfileData/SampleProf.h(1427): error C3200: 'llvm::DenseMap<llvm::hash_code,ValueT,llvm::DenseMapInfo<llvm::hash_code,void>,llvm::detail::DenseMapPair<KeyT,ValueT>>': invalid template argument for template parameter 'MapT', expected a class template
        with
        [
            ValueT=llvm::sampleprof::FunctionSamples,
            KeyT=llvm::hash_code
        ]
C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include\llvm/ProfileData/SampleProf.h(1446): error C3200: 'llvm::DenseMap<llvm::hash_code,ValueT,llvm::DenseMapInfo<llvm::hash_code,void>,llvm::detail::DenseMapPair<KeyT,ValueT>>': invalid template argument for template parameter 'MapT', expected a class template
        with
        [
            ValueT=llvm::sampleprof::FunctionSamples,
            KeyT=llvm::hash_code
        ]

dyung added a reverting change: rGc9a8a0e8a9b2: Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build….Jun 23 2023, 5:59 PM

huangjd mentioned this in D153692: [llvm-profdata] Remove MD5 collision check in D147740.Jun 24 2023, 12:43 AM

huangjd added a child revision: D153692: [llvm-profdata] Remove MD5 collision check in D147740.Jun 24 2023, 12:46 AM

huangjd reopened this revision.Jun 26 2023, 3:24 PM

This revision is now accepted and ready to land.Jun 26 2023, 3:24 PM

Fixed build error in MSVC

Harbormaster completed remote builds in B241321: Diff 534765.Jun 26 2023, 4:00 PM

This revision was landed with ongoing or failed builds.Jun 26 2023, 5:06 PM

Closed by commit rG12e9c7aaa66b: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed… (authored by huangjd).

This revision was automatically updated to reflect the committed changes.

huangjd added a commit: rG12e9c7aaa66b: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed….

MaskRay mentioned this in rG6d871eb95604: [llvm-profdata][unittest] Fix -Wsign-compare after D147740.Jun 26 2023, 6:05 PM

This seems to have broken https://lab.llvm.org/buildbot/#/builders/245/builds/10311

Kindly revert or address the issues.

This revision is now accepted and ready to land.Jun 27 2023, 1:47 AM

hokein added a reverting change: rG58056ae29963: Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build….Jun 27 2023, 6:20 AM

This patch causes regressions on Solaris/sparcv9:

+  LLVM :: Transforms/SampleProfile/ctxsplit.ll
+  LLVM :: Transforms/SampleProfile/indirect-call.ll
+  LLVM :: Transforms/SampleProfile/profile-format.ll
+  LLVM :: tools/llvm-profdata/sample-nametable.test
+  LLVM-Unit :: ProfileData/./ProfileDataTests/26/194

All of them fail in the same way (using indirect-call.ll as an example):

Stack dump:
0.      Program arguments: /var/llvm/dist-sparcv9-release-stage2-A-flang/tools/clang/stage2-bins/bin/opt -S /vol/llvm/src/llvm-project/dist/llvm/test/Transforms/SampleProfile/indirect-call.ll -passes=sample-profile -sample-profile-file=/vol/llvm/src/llvm-project/dist/llvm/test/Transforms/SampleProfile/Inputs/indirect-call.extbinary.afdo
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  opt       0x00000001063410ec llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 36
1  opt       0x00000001063419d8 SignalHandler(int) + 896
2  libc.so.1 0xffffffff7eec4f00 __sighndlr + 12
3  libc.so.1 0xffffffff7eeb77a8 call_user_handler + 1024
4  libc.so.1 0xffffffff7eeb7b68 sigacthandler + 160
5  opt       0x000000010726a738 llvm::sampleprof::SampleProfileReaderBinary::readSampleContextFromTable() + 556
6  opt       0x000000010726e370 llvm::sampleprof::SampleProfileReaderExtBinaryBase::readFuncOffsetTable() + 680
7  opt       0x000000010726bb88 llvm::sampleprof::SampleProfileReaderExtBinaryBase::readOneSection(unsigned char const*, unsigned long, llvm::sampleprof::SecHdrTableEntry const&) + 484
8  opt       0x000000010726f1bc llvm::sampleprof::SampleProfileReaderExtBinaryBase::readImpl() + 212
9  opt       0x0000000106d31128 llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 2172
10 opt       0x0000000105c94770 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 480
11 opt       0x000000010373b650 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool) + 11052
12 opt       0x0000000103749cf0 main + 8592
13 opt       0x0000000103738544 _start + 100
FileCheck error: '<stdin>' is empty.

truss (the Solaris syscall tracer) shows

22092:	    Incurred fault #5, FLTACCESS  %pc = 0x10726A738
22092:	      siginfo: SIGBUS BUS_ADRALN addr=0x1082ED3D3
22092:	    Received signal #10, SIGBUS [caught]
22092:	      siginfo: SIGBUS BUS_ADRALN addr=0x1082ED3D3

i.e. the code tries an unaligned access, which is a no-no on strict-alignment targets like SPARC.

@ro It seems the error you described does not match the error reported in https://lab.llvm.org/buildbot/#/builders/245/builds/10311. The build bot showed a few tests triggering assert on ARM, which I am investigating. However you said the error is about unaligned access on readSampleContextFromTable.

In D147740#4452405, @ro wrote:

...
i.e. the code tries an unaligned access, which is a no-no on strict-alignment targets like SPARC.

It looks like the error happens at line 577 of SampleProfReader.cpp. I changed `hash_code Hash = MD5SampleContextStart[Idx]`
into

hash_code Hash =
      support::endian::read<hash_code, support::little, support::unaligned>(
          MD5SampleContextStart + Idx);

MD5SampleContextStart can point to an unaligned position in the actual profile, if I understand it correctly. I am not sure whether this is a bug in the SPARC code generator itself, though: MD5SampleContextStart[Idx] is a valid C++ expression even if MD5SampleContextStart is not aligned, so the compiler should generate the correct code.
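For reference, an alignment-safe little-endian read can also be written in portable C++ without any LLVM helper. The sketch below is an illustration, not the code in the patch, and `readLE64` is a hypothetical name. It assembles the value byte by byte, so it never performs a wide load from a misaligned address and it gives the same result on big-endian hosts such as SPARC. Note that under the C++ standard, reading a uint64_t through a pointer that is not suitably aligned is undefined behavior, so a compiler may legitimately emit a plain aligned load for an expression like Ptr[Idx].

```cpp
#include <cassert>
#include <cstdint>

// Reads 8 bytes starting at P as a little-endian 64-bit value. P may have
// any alignment; the host byte order does not matter.
inline std::uint64_t readLE64(const unsigned char *P) {
  std::uint64_t V = 0;
  for (int I = 7; I >= 0; --I)
    V = (V << 8) | P[I]; // most significant byte is stored last
  return V;
}
```

Reading from `Buf + 1` in a byte buffer, for example, is deliberately misaligned and still well defined with this helper.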

I don't have a sparc machine, so I would like to find someone to test it.

Update SampleProfReader for unaligned access on SPARC

can you repro the previous failure with the alignment sanitizer? then if that goes away with your fix it should be good

Harbormaster completed remote builds in B243630: Diff 537945.Jul 6 2023, 8:55 PM

In D147740#4479339, @aeubanks wrote:

can you repro the previous failure with the alignment sanitizer? then if that goes away with your fix it should be good

I built it with -DLLVM_USE_SANITIZER=Memory and didn't see any (new) issue on X86, but I don't have a SPARC machine available to test on it. Is there another flag I need to use?

In D147740#4482560, @huangjd wrote:

In D147740#4479339, @aeubanks wrote:

can you repro the previous failure with the alignment sanitizer? then if that goes away with your fix it should be good

I built it with -DLLVM_USE_SANITIZER=Memory and didn't see any (new) issue on X86, but I don't have a SPARC machine available to test on it. Is there another flag I need to use?

alignment checking is a ubsan thing, not a msan thing, so it should be -DLLVM_USE_SANITIZER=Undefined. I believe that should enable alignment checking: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#available-checks

In D147740#4485824, @aeubanks wrote:

In D147740#4482560, @huangjd wrote:

In D147740#4479339, @aeubanks wrote:

can you repro the previous failure with the alignment sanitizer? then if that goes away with your fix it should be good

I built it with -DLLVM_USE_SANITIZER=Memory and didn't see any (new) issue on X86, but I don't have a SPARC machine available to test on it. Is there another flag I need to use?

alignment checking is a ubsan thing, not a msan thing, so it should be -DLLVM_USE_SANITIZER=Undefined. I believe that should enable alignment checking: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#available-checks

That option does not work for me. Note that -DLLVM_USE_SANITIZER is an option in CMake configuration, but the list you gave is compiler options. I got the error cc: error: unrecognized argument to ‘-fno-sanitize=’ option: ‘function’ when -DLLVM_USE_SANITIZER is set to anything other than Memory. Could you show the exact command to build and run all tests with said sanitizer enabled?

In D147740#4486533, @huangjd wrote:

In D147740#4485824, @aeubanks wrote:

In D147740#4482560, @huangjd wrote:

In D147740#4479339, @aeubanks wrote:

can you repro the previous failure with the alignment sanitizer? then if that goes away with your fix it should be good

I built it with -DLLVM_USE_SANITIZER=Memory and didn't see any (new) issue on X86, but I don't have a SPARC machine available to test on it. Is there another flag I need to use?

alignment checking is a ubsan thing, not a msan thing, so it should be -DLLVM_USE_SANITIZER=Undefined. I believe that should enable alignment checking: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#available-checks

That option does not work for me. Note that -DLLVM_USE_SANITIZER is an option in CMake configuration, but the list you gave is compiler options. I got the error cc: error: unrecognized argument to ‘-fno-sanitize=’ option: ‘function’ when -DLLVM_USE_SANITIZER is set to anything other than Memory. Could you show the exact command to build and run all tests with said sanitizer enabled?

cmake -S llvm -B build/cmake -GNinja -DLLVM_TARGETS_TO_BUILD=X86 -DCMAKE_BUILD_TYPE=Release -DLLVM_USE_SANITIZER=Undefined -DLLVM_ENABLE_LLD=ON -DCMAKE_C_COMPILER=$HOME/repos/chromium/src/third_party/llvm-build/Release+Asserts/bin/clang -DCMAKE_CXX_COMPILER=$HOME/repos/chromium/src/third_party/llvm-build/Release+Asserts/bin/clang++ works for me. I'm using Chrome's toolchain, which is a clang package very close to ToT. That error message seems to imply that the host compiler you're using isn't a recent clang.

updated test to fix sign comparison warning

@ro I do not have access to SPARC machine, is there a way I can get a definite testing result on it, instead of adding more alignment checks and testing it on X86?

In D147740#4451540, @omjavaid wrote:

This seems to have broken https://lab.llvm.org/buildbot/#/builders/245/builds/10311

Kindly revert or address the issues.

I tested the latest diff on ARM and could not replicate the errors.

Harbormaster completed remote builds in B244641: Diff 539339.Jul 11 2023, 10:28 PM

In D147740#4491943, @huangjd wrote:

@ro I do not have access to SPARC machine, is there a way I can get a definite testing result on it, instead of adding more alignment checks and testing it on X86?

You actually do have access to SPARC machines, both Solaris and Linux: the GCC farm provides just that.

That said, I've tested the previous version of your patch on Solaris/sparcv9. While 4 of the failures I'd reported are gone, LLVM :: Transforms/SampleProfile/profile-format.ll still FAILs, now with

Assertion failed: Hash == hash_value(Key), file /vol/llvm/src/llvm-project/local/llvm/include/llvm/ProfileData/SampleProf.h, line 1352
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /var/llvm/local-sparcv9-release-stage2-A-flang/tools/clang/stage2-bins/bin/opt -passes=sample-profile -sample-profile-file=/vol/llvm/src/llvm-project/local/llvm/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo -S
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  opt       0x00000001063a8648 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 36
1  opt       0x00000001063a8f34 SignalHandler(int) + 896
2  libc.so.1 0xffffffff7eec4f00 __sighndlr + 12
3  libc.so.1 0xffffffff7eeb77a8 call_user_handler + 1024
4  libc.so.1 0xffffffff7eeb7b98 sigacthandler + 208
5  libc.so.1 0xffffffff7eec9fc0 __lwp_sigqueue + 8
6  libc.so.1 0xffffffff7ede484c abort + 180
7  libc.so.1 0xffffffff7ede5680 _assert + 96
8  opt       0x0000000106dac008 std::pair<llvm::DenseMapIterator<llvm::hash_code, llvm::sampleprof::FunctionSamples, llvm::DenseMapInfo<llvm::hash_code, void>, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>, false>, bool> llvm::sampleprof::HashKeyMap<llvm::DenseMap, llvm::sampleprof::SampleContext, llvm::sampleprof::FunctionSamples>::try_emplace<llvm::sampleprof::FunctionSamples>(llvm::hash_code const&, llvm::sampleprof::SampleContext const&, llvm::sampleprof::FunctionSamples&&) + 396
9  opt       0x00000001072dfe40 llvm::sampleprof::SampleProfileReaderBinary::readFuncProfile(unsigned char const*) + 284
10 opt       0x00000001072e25d0 llvm::sampleprof::SampleProfileReaderExtBinaryBase::readFuncProfiles() + 3476
11 opt       0x00000001072e02f4 llvm::sampleprof::SampleProfileReaderExtBinaryBase::readOneSection(unsigned char const*, unsigned long, llvm::sampleprof::SecHdrTableEntry const&) + 384
12 opt       0x00000001072e3994 llvm::sampleprof::SampleProfileReaderExtBinaryBase::readImpl() + 212
13 opt       0x0000000106da3210 llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 2172
14 opt       0x0000000105cf7ca8 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 480
15 opt       0x0000000103769858 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) + 11060
16 opt       0x0000000103777f5c main + 8692
17 opt       0x0000000103766744 _start + 100

In addition, quite a number of other tests now FAIL with the same assertion failure:

LLVM :: Transforms/SampleProfile/compressed-profile-symbol-list.ll
LLVM :: Transforms/SampleProfile/csspgo-import-list.ll
LLVM :: Transforms/SampleProfile/csspgo-inline-icall.ll
LLVM :: Transforms/SampleProfile/csspgo-inline.ll
LLVM :: Transforms/SampleProfile/fsafdo_test.ll
LLVM :: Transforms/SampleProfile/inline-mergeprof.ll
LLVM :: Transforms/SampleProfile/profile-context-tracker.ll
LLVM :: Transforms/SampleProfile/profile-format-compress.ll
LLVM :: Transforms/SampleProfile/profile-format.ll
LLVM :: Transforms/SampleProfile/profile-sample-accurate.ll
LLVM :: Transforms/SampleProfile/pseudo-probe-inline.ll
LLVM :: Transforms/SampleProfile/remap.ll
LLVM :: Transforms/SampleProfile/uncompressed-profile-symbol-list.ll
LLVM :: Transforms/SampleProfile/uniqname.ll

I've since retried with the latest version of the patch, but the failures remain.

Another question regarding ARM and SPARC, what is size_t on these machines? I suspect the error has something to do with that, since I couldn't replicate them on X64.

In D147740#4495119, @huangjd wrote:

Another question regarding ARM and SPARC, what is size_t on these machines? I suspect the error has something to do with that, since I couldn't replicate them on X64.

Solaris <iso/stddef_iso.h> has

#if defined(_LP64) || defined(_I32LPx) 
typedef unsigned long   size_t;         /* size of something in bytes */
#else
typedef unsigned int    size_t;         /* (historical version) */ 
#endif

Nothing unusual here.

I still suspect that this is rather an endianness issue.

In SampleProfileWriter, prevent already hashed function being hashed again when writing the profile

Harbormaster completed remote builds in B244958: Diff 539808.Jul 12 2023, 7:09 PM

In D147740#4495780, @huangjd wrote:

In SampleProfileWriter, prevent already hashed function being hashed again when writing the profile

This didn't make a difference, unfortunately.

In D147740#4495250, @ro wrote:
In D147740#4495119, @huangjd wrote:

Another question regarding ARM and SPARC, what is size_t on these machines? I suspect the error has something to do with that, since I couldn't replicate them on X64.

Solaris <iso/stddef_iso.h> has
#if defined(_LP64) || defined(_I32LPx) 
typedef unsigned long   size_t;         /* size of something in bytes */
#else
typedef unsigned int    size_t;         /* (historical version) */ 
#endif
Nothing unusual here.

I still suspect that this is rather an endianness issue.

I am waiting for the approval to use GCC farm machine so that I can debug it. Do you know any person I can ping to expedite the process?

In D147740#4498697, @huangjd wrote:

I am waiting for the approval to use GCC farm machine so that I can debug it. Do you know any person I can ping to expedite the process?

Unfortunately not: sometimes they're very quick, at others a response takes days (or even weeks).

Fix MD5 hash table write on big endian

Always use little endian write on Line 586 in SampleProfileReader.cpp, this should make it consistent on different systems
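The idea behind this fix, writing a 64-bit hash in one fixed byte order regardless of the host, can be sketched as follows. This is only an illustration: `writeLE64` and `readLE64` are hypothetical helper names invented here, not the functions the patch actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 64-bit value to a byte buffer in
// little-endian order, independent of the host's native byte order.
void writeLE64(std::vector<uint8_t> &Out, uint64_t Value) {
  for (int I = 0; I < 8; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Hypothetical helper: read a 64-bit little-endian value back.
uint64_t readLE64(const uint8_t *Data) {
  uint64_t Value = 0;
  for (int I = 0; I < 8; ++I)
    Value |= static_cast<uint64_t>(Data[I]) << (8 * I);
  return Value;
}
```

Because both helpers use shifts rather than copying the in-memory representation, a table written on a big-endian host such as SPARC has the same bytes as one written on x86.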

@huangjd any follow up on simplifying the implementation based on the assumption that collision is non-issue? Just want to make sure the comments on D153692 don't fall through the cracks.

Harbormaster completed remote builds in B245245: Diff 540214.Jul 13 2023, 7:34 PM

huangjd added a child revision: D155257: [llvm-profdata] Changed SampleProfWriter to take a range of of NameFunctionSamples.Jul 13 2023, 8:07 PM

In D147740#4499375, @huangjd wrote:

Always use little endian write on Line 586 in SampleProfileReader.cpp, this should make it consistent on different systems

With the last revision, the assertion failures are gone. However, two failures do remain:

+  LLVM :: Transforms/SampleProfile/inline-mergeprof.ll

Command Output (stderr):
--
/vol/llvm/src/llvm-project/local/llvm/test/Transforms/SampleProfile/profile-format.ll:27:10: error: CHECK: expected string not found in input
; CHECK: br i1 %cmp, label %while.body, label %while.end{{.*}} !prof ![[IDX1:[0-9]*]]
         ^
<stdin>:1:1: note: scanning from here
; ModuleID = '<stdin>'
^
<stdin>:32:2: note: possible intended match here
 br i1 %cmp, label %while.body, label %while.end, !dbg !42
 ^

+  LLVM :: Transforms/SampleProfile/profile-format.ll

Command Output (stderr):
--
/vol/llvm/src/llvm-project/local/llvm/test/Transforms/SampleProfile/profile-format.ll:27:10: error: CHECK: expected string not found in input
; CHECK: br i1 %cmp, label %while.body, label %while.end{{.*}} !prof ![[IDX1:[0-9]*]]
         ^
<stdin>:1:1: note: scanning from here
; ModuleID = '<stdin>'
^
<stdin>:32:2: note: possible intended match here
 br i1 %cmp, label %while.body, label %while.end, !dbg !42
 ^

In D147740#4499606, @wenlei wrote:

@huangjd any follow up on simplifying the implementation based on the assumption that collision is non-issue? Just want to make sure the comments on D153692 don't fall through the cracks.

I am actually going to change that patch to removing the collision check

@ro
I am using the SPARC machine on GCC farm but it seems to hang on linking the unit tests. Is this normal? Is there an easy option to cross build all the binaries from a X86 machine targeting SPARC and send the file over?

What command did you use to build and test on SPARC?

In D147740#4529339, @huangjd wrote:

@ro
I am using the SPARC machine on GCC farm but it seems to hang on linking the unit tests. Is this normal? Is there an easy option to cross build all the binaries from a X86 machine targeting SPARC and send the file over?

Which cfarm system did you use for the builds? There are several, both Solaris and Linux. I've avoided the Linux ones since they could be unreliable at times. I'd suggest going for gcc106, the only Solaris 11.4 box which is required for LLVM. There are two issues to be wary about:

Be careful to control the parallelism: the link steps can be quite memory intensive (for comparison's sake, my Solaris 11.4/SPARC box used to have 256 GB RAM and is now at 512 GB which helped tremendously).
If you are doing a debug build, the linking is painfully slow indeed (on the order of hours in some cases). Nothing to do about that but being patient.

I've never done cross builds and would expect them to be a real PITA, especially since clang only supports the Solaris linker right now, and GNU ld is not command-line compatible.

What command did you use to build and test on SPARC?

I'm using a script of my own to do the cmake invocation, for a considerable part handling the 2-stage build. Nothing really special in there, though.

Fix endianness issue on SPARC

Harbormaster completed remote builds in B248412: Diff 544567.Jul 26 2023, 10:24 PM

This revision was landed with ongoing or failed builds.Jul 27 2023, 4:08 PM

Closed by commit rG66ba71d913df: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed… (authored by huangjd). · Explain Why

This revision was automatically updated to reflect the committed changes.

huangjd added a commit: rG66ba71d913df: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed….

vvereschaka removed a subscriber: vvereschaka.Jul 27 2023, 4:09 PM

Hi, I think this patch broke clang-armv8-quick bot : https://lab.llvm.org/buildbot/#/builders/245/builds/11732
Could you please take a look ?

In D147740#4537255, @huangjd wrote:

Fix endianness issue on SPARC

Thanks: Solaris/sparcv9 results with that revision are fine now indeed.

I think this patch broke clang-armv8-quick bot

Some context here, that's 32 bit Armv8. Our Armv7 32 bit bots aren't happy with it either.

aaron.ballman added a reverting change: rG1a53b5c367b5: Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build….Jul 28 2023, 6:42 AM

In D147740#4541350, @antmo wrote:

Hi, I think this patch broke clang-armv8-quick bot : https://lab.llvm.org/buildbot/#/builders/245/builds/11732
Could you please take a look ?

This broke quite a few of the ARM bots, so I've reverted in https://github.com/llvm/llvm-project/commit/1a53b5c367b5ebf7d7f34afaa653ea337982f1d6 to hopefully get them back to green while you investigate. Sorry for any troubles!

FWIW This was also crashing our x86 builds (using -flto=thin and -fprofile-sample-use). Asan reports a use-after-free: https://reviews.llvm.org/F28484911

In case it's relevant: Those crashing builds use profile files created with slightly older versions of LLVM. I assume backwards compatibility is provided with this patch or at least some versioning check and proper error message if file format is incompatible?

In D147740#4542922, @MatzeB wrote:

In case it's relevant: Those crashing builds use profile files created with slightly older versions of LLVM. I assume backwards compatibility is provided with this patch or at least some versioning check and proper error message if file format is incompatible?

Do you have the profile available for me to test?

Do you have the profile available for me to test?

The profiles are hundreds of megabytes and I am not sure whether I am even allowed to make them public. Though I can test changes or inspect the profiles if it helps. Otherwise I can try to create a smaller reproducer on Monday.

huangjd reopened this revision.Aug 1 2023, 2:38 PM

This revision is now accepted and ready to land.Aug 1 2023, 2:38 PM

huangjd updated this revision to Diff 546232.Aug 1 2023, 2:38 PM

This comment was removed by huangjd.

Harbormaster completed remote builds in B249606: Diff 546232.Aug 1 2023, 4:29 PM

Fixed more errors on 32-bit platform, always use uint64_t for function hash value
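A minimal sketch of why a fixed 64-bit type matters here, assuming the failures came from a 64-bit hash passing through a pointer-sized integer (the thread does not say exactly which variable was at fault). The function names are hypothetical, and `uint32_t` stands in for `size_t` on a typical 32-bit target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration: store a 64-bit hash in a 32-bit integer,
// as would happen with a 32-bit size_t. The upper 32 bits are discarded.
uint32_t storeIn32Bit(uint64_t Hash) { return static_cast<uint32_t>(Hash); }

// True if the hash survives a round trip through the narrower type.
bool survivesRoundTrip(uint64_t Hash) {
  return static_cast<uint64_t>(storeIn32Bit(Hash)) == Hash;
}
```

Any hash with non-zero upper bits fails the round trip, so a lookup keyed on the truncated value no longer matches the 64-bit value stored in the profile.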

Harbormaster completed remote builds in B249654: Diff 546296.Aug 1 2023, 8:17 PM

In D147740#4542317, @aaron.ballman wrote:

In D147740#4541350, @antmo wrote:

Hi, I think this patch broke clang-armv8-quick bot : https://lab.llvm.org/buildbot/#/builders/245/builds/11732
Could you please take a look ?

This broke quite a few of the ARM bots, so I've reverted in https://github.com/llvm/llvm-project/commit/1a53b5c367b5ebf7d7f34afaa653ea337982f1d6 to hopefully get them back to green while you investigate. Sorry for any troubles!

Are you able to verify that the current patch has resolved the problem? I do not have a 32-bit ARM machine, and GCC farm does not provide one either. I am not sure how to test it on buildbot without actually submitting it

@MatzeB

After some investigation it looks like it's an independent bug in Transforms/IPO/SampleProfile.cpp that has never been discovered.

In SampleProfile.cpp, non-inlined callees are added to the Profiles as new functions, which can trigger a rehashing and invalidate all iterators, including SampleProfileLoader::SampleProfileLoaderBaseImpl::Samples, which is a pointer to the current function's FunctionSamples inside the profile. This pointer is later used, so it is already undefined behavior.

LLVM's standard library implementation was able to expand the profile (originally unordered_map) in place without relocating objects, so the code would work. In my patch I changed the container to llvm::DenseMap, which will always relocate objects on rehashing, causing the pointer to be invalid and crashing the program. Since std::unordered_map does not guarantee that either, this function is UB and needs to be rewritten.

I am going to submit a patch to that first, although I am not able to create a small test case for that bug since std::unordered_map is doing well to avoid relocating objects as long as there is enough memory, so it would require more careful code review.
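The hazard described above, a pointer kept across an insertion that relocates the container's elements, can be sketched as follows. This is an illustration only: `std::vector` is used as a stand-in for a container that moves its elements on growth, and `Samples`, `elementMovedAfterGrowth`, and `addThenUpdate` are hypothetical names, not code from SampleProfile.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Samples { uint64_t Total = 0; };

// Returns true if growing the container moved the tracked element.
// std::vector is only a stand-in for a relocating container.
bool elementMovedAfterGrowth() {
  std::vector<Samples> Profiles(1);
  // Record the address as an integer before growth; the old pointer
  // must never be dereferenced afterwards.
  auto Before = reinterpret_cast<std::uintptr_t>(&Profiles[0]);
  Profiles.resize(Profiles.capacity() + 1); // forces a reallocation
  auto After = reinterpret_cast<std::uintptr_t>(&Profiles[0]);
  return Before != After;
}

// Safer pattern: keep an index (or key) and look the element up again
// after any operation that may grow the container.
uint64_t addThenUpdate(std::vector<Samples> &Profiles, std::size_t Current) {
  Profiles.emplace_back();       // may relocate every element
  Profiles[Current].Total += 10; // fresh lookup, never a stale pointer
  return Profiles[Current].Total;
}
```

The second function shows one possible shape of a fix: re-resolve the element after the insertion instead of caching its address. Whether the actual follow-up patch takes this approach is not stated in this comment.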

Thanks for looking into this! Sounds like we may need unreasonably large inputs so going without a test should be okay. Either as a separate patch or merged with this one, whatever works best for you.

huangjd mentioned this in D157061: [SampleProfile] Potential use after move in SampleProfileLoader::promoteMergeNotInlinedContextSamples.Aug 3 2023, 5:50 PM

huangjd added a parent revision: D157061: [SampleProfile] Potential use after move in SampleProfileLoader::promoteMergeNotInlinedContextSamples.Aug 3 2023, 5:50 PM

In D147740#4558131, @MatzeB wrote:

Thanks for looking into this! Sounds like we may need unreasonably large inputs so going without a test should be okay. Either as a separate patch or merged with this one, whatever works best for you.

The fix is in D157061

@antmo @DavidSpickett @aaron.ballman
Are you able to check if the latest version fixed the issues on ARM-32? I do not have access to ARM-32 machine but was able to reproduce the same failing test cases in Win32 and fixed it. I would like to confirm it's working before landing the patch again

In D147740#4570400, @huangjd wrote:

@antmo @DavidSpickett @aaron.ballman
Are you able to check if the latest version fixed the issues on ARM-32? I do not have access to ARM-32 machine but was able to reproduce the same failing test cases in Win32 and fixed it. I would like to confirm it's working before landing the patch again

I don't have access to an ARM machine myself; I only noticed the issue through the build farm. If you think the code is working, I think it's fine to land it again to see if the bots agree -- it's pretty normal to not have access to all the same hardware as the bots, so speculative commits are somewhat routine.

Hi @huangjd, the 8 failures shown by the bot clang-armv8-quick look fixed in the latest version.

I think it's fine to land it again to see if the bots agree

Yes, totally fine to do this if you're around to revert promptly.

Thanks Antoine for testing the patch!

Rebase D157061

Harbormaster completed remote builds in B252813: Diff 550561.Aug 15 2023, 10:35 PM

huangjd mentioned this in rGda2855c0bad0: [SampleProfile] Potential use after move in SampleProfileLoader….Aug 16 2023, 1:32 PM

This revision was landed with ongoing or failed builds.Aug 17 2023, 1:11 PM

Closed by commit rG7624de5beae2: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed… (authored by huangjd). · Explain Why

This revision was automatically updated to reflect the committed changes.

huangjd added a commit: rG7624de5beae2: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed….

This diff is causing clang to crash on one of our builds, when clang is linked with jemalloc.
-DCMAKE_EXE_LINKER_FLAGS="-L /usr/lib64 -Wl,-rpath,/usr/lib64 -ljemalloc" \

I am trying to come up with small repro, but in meantime can this diff be reverted?

Full stack trace:

 #0 0x000055b18add6a58 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x000055b18add4950 llvm::sys::RunSignalHandlers() /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x000055b18add722d SignalHandler(int) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007ffaa7812cf0 __restore_rt (/usr/lib64/libpthread.so.0+0x12cf0)
 #4 0x000055b18c0c9033 llvm::sampleprof::LineLocation::operator<(llvm::sampleprof::LineLocation const&) const /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:296:23
 #5 0x000055b18c0c9033 std::less<llvm::sampleprof::LineLocation>::operator()(llvm::sampleprof::LineLocation const&, llvm::sampleprof::LineLocation const&) const /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_function.h:408:20
 #6 0x000055b18c0c9033 std::_Rb_tree<llvm::sampleprof::LineLocation, std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>, std::_Select1st<std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>>, std::less<llvm::sampleprof::LineLocation>, std::allocator<std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>>>::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_iterator<std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>>, llvm::sampleprof::LineLocation const&) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_tree.h:2220:11
 #7 0x000055b18c0caf2d std::_Rb_tree_iterator<std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>> std::_Rb_tree<llvm::sampleprof::LineLocation, std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>, std::_Select1st<std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>>, std::less<llvm::sampleprof::LineLocation>, std::allocator<std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>>>::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<llvm::sampleprof::LineLocation&&>, std::tuple<>>(std::_Rb_tree_const_iterator<std::pair<llvm::sampleprof::LineLocation const, llvm::sampleprof::SampleRecord>>, std::piecewise_construct_t const&, std::tuple<llvm::sampleprof::LineLocation&&>&&, std::tuple<>&&) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_tree.h:2462:15
 #8 0x000055b18c0c9bfb llvm::sampleprof::FunctionSamples::addBodySamples(unsigned int, unsigned int, unsigned long, unsigned long) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_map.h:0:15
 #9 0x000055b18c0c6d27 llvm::sampleprof::ProfileConverter::flattenNestedProfile(llvm::sampleprof::SampleProfileMap&, llvm::sampleprof::FunctionSamples const&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:1599:21
#10 0x000055b18c0c659f llvm::sampleprof::ProfileConverter::flattenProfile(llvm::sampleprof::SampleProfileMap const&, llvm::sampleprof::SampleProfileMap&, bool) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:0:9
#11 0x000055b18c0c0e28 std::__uniq_ptr_impl<(anonymous namespace)::SampleProfileMatcher, std::default_delete<(anonymous namespace)::SampleProfileMatcher>>::reset((anonymous namespace)::SampleProfileMatcher*) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/unique_ptr.h:200:26
#12 0x000055b18c0c0e28 std::__uniq_ptr_impl<(anonymous namespace)::SampleProfileMatcher, std::default_delete<(anonymous namespace)::SampleProfileMatcher>>::operator=(std::__uniq_ptr_impl<(anonymous namespace)::SampleProfileMatcher, std::default_delete<(anonymous namespace)::SampleProfileMatcher>>&&) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/unique_ptr.h:183:2
#13 0x000055b18c0c0e28 std::__uniq_ptr_data<(anonymous namespace)::SampleProfileMatcher, std::default_delete<(anonymous namespace)::SampleProfileMatcher>, true, true>::operator=(std::__uniq_ptr_data<(anonymous namespace)::SampleProfileMatcher, std::default_delete<(anonymous namespace)::SampleProfileMatcher>, true, true>&&) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/unique_ptr.h:235:61
#14 0x000055b18c0c0e28 std::unique_ptr<(anonymous namespace)::SampleProfileMatcher, std::default_delete<(anonymous namespace)::SampleProfileMatcher>>::operator=(std::unique_ptr<(anonymous namespace)::SampleProfileMatcher, std::default_delete<(anonymous namespace)::SampleProfileMatcher>>&&) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/unique_ptr.h:406:51
#15 0x000055b18c0c0e28 (anonymous namespace)::SampleProfileLoader::doInitialization(llvm::Module&, llvm::AnalysisManager<llvm::Function>*) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2104:21
#16 0x000055b18c0c0e28 llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2632:21
#17 0x000055b18bfcfb0d llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:5
#18 0x000055b18a95e3b9 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/IR/PassManager.h:521:10
#19 0x000055b18b4f33ff llvm::SmallPtrSetImplBase::isSmall() const /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:195:33
#20 0x000055b18b4f33ff llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:83:10
#21 0x000055b18b4f33ff llvm::PreservedAnalyses::~PreservedAnalyses() /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/IR/PassManager.h:152:7
#22 0x000055b18b4f33ff (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1101:5
#23 0x000055b18b4eaa22 (anonymous namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:0:3
#24 0x000055b18b4eaa22 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1321:13
#25 0x000055b18b95a8f5 std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>::~unique_ptr() /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/unique_ptr.h:395:6
#26 0x000055b18b95a8f5 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:386:7
#27 0x000055b18cf852c6 __gnu_cxx::__normal_iterator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>*, std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>>::__normal_iterator(std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>* const&) /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_iterator.h:1073:20
#28 0x000055b18cf852c6 std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>::begin() /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_vector.h:869:16
#29 0x000055b18cf852c6 void clang::finalize<std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>>(std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>&, clang::Sema const&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/include/clang/Sema/TemplateInstCallback.h:54:16
#30 0x000055b18cf852c6 clang::ParseAST(clang::Sema&, bool, bool) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/Parse/ParseAST.cpp:183:3
#31 0x000055b18b881b10 clang::FrontendAction::Execute() /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1067:10
#32 0x000055b18b7fdc3d llvm::Error::getPtr() const /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/Support/Error.h:270:42
#33 0x000055b18b7fdc3d llvm::Error::operator bool() /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/Support/Error.h:233:16
#34 0x000055b18b7fdc3d clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1054:23
#35 0x000055b18b953335 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:272:25
#36 0x000055b189dc769c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/ayermolo/local/upstream-llvm/llvm-project/clang/tools/driver/cc1_main.cpp:249:15
#37 0x000055b189dc479a ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/tools/driver/driver.cpp:366:12
#38 0x000055b189dc3732 clang_main(int, char**, llvm::ToolContext const&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/tools/driver/driver.cpp:407:12
#39 0x000055b189dd2d71 main /home/ayermolo/local/llvm-build-upstream-release/tools/clang/tools/driver/clang-driver.cpp:15:3
#40 0x00007ffaa583ad85 __libc_start_main (/usr/lib64/libc.so.6+0x3ad85)
#41 0x000055b189dc102e _start (/home/ayermolo/local/llvm-build-upstream-release/bin/clang+++0x2a6a02e)

@huangjd

MSAN output. Does this ring any bells?

==3359553==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x564b193d7f4e in llvm::sampleprof::FunctionSamples::getHeadSamplesEstimate() const /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:959:30
    #1 0x564b193ce93a in llvm::sampleprof::ProfileConverter::flattenNestedProfile(llvm::sampleprof::SampleProfileMap&, llvm::sampleprof::FunctionSamples const&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:1612:36
    #2 0x564b193ce582 in llvm::sampleprof::ProfileConverter::flattenNestedProfile(llvm::sampleprof::SampleProfileMap&, llvm::sampleprof::FunctionSamples const&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:1607:9
    #3 0x564b193ca9cc in llvm::sampleprof::ProfileConverter::flattenProfile(llvm::sampleprof::SampleProfileMap const&, llvm::sampleprof::SampleProfileMap&, bool) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:1558:9
    #4 0x564b1936a0f6 in (anonymous namespace)::SampleProfileMatcher::SampleProfileMatcher(llvm::Module&, llvm::sampleprof::SampleProfileReader&, llvm::PseudoProbeManager const*) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:466:7
    #5 0x564b1936a0f6 in std::__1::__unique_if<(anonymous namespace)::SampleProfileMatcher>::__unique_single std::__1::make_unique[abi:v180000]<(anonymous namespace)::SampleProfileMatcher, llvm::Module&, llvm::sampleprof::SampleProfileReader&, llvm::PseudoProbeManager*>(llvm::Module&, llvm::sampleprof::SampleProfileReader&, llvm::PseudoProbeManager*&&) /home/ayermolo/local/upstream-llvm/llvm-project/build_libcxx/include/c++/v1/__memory/unique_ptr.h:685:30
    #6 0x564b1936a0f6 in (anonymous namespace)::SampleProfileLoader::doInitialization(llvm::Module&, llvm::AnalysisManager<llvm::Function>*) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2105:9
    #7 0x564b1936a0f6 in llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2632:21
    #8 0x564b18f338ac in llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:17
    #9 0x564b12a1306b in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/IR/PassManager.h:517:40
    #10 0x564b15b5e1a7 in (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream>>&, std::__1::unique_ptr<llvm::ToolOutputFile, std::__1::default_delete<llvm::ToolOutputFile>>&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1101:9
    #11 0x564b15b447fd in (anonymous namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream>>) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1158:3
    #12 0x564b15b447fd in clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream>>) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1321:13
    #13 0x564b16f72ca1 in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:386:7
    #14 0x564b1d01a29b in clang::ParseAST(clang::Sema&, bool, bool) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/Parse/ParseAST.cpp:176:13
    #15 0x564b16be4f7e in clang::FrontendAction::Execute() /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1063:8
    #16 0x564b1695cd8d in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1054:33
    #17 0x564b16f50383 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:272:25
    #18 0x564b0f3bb128 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/ayermolo/local/upstream-llvm/llvm-project/clang/tools/driver/cc1_main.cpp:249:15
    #19 0x564b0f3b0d6a in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/tools/driver/driver.cpp:366:12
    #20 0x564b0f3abb78 in clang_main(int, char**, llvm::ToolContext const&) /home/ayermolo/local/upstream-llvm/llvm-project/clang/tools/driver/driver.cpp:407:12
    #21 0x564b0f3ec39c in main /home/ayermolo/local/llvm-build-upstream-msan-release/tools/clang/tools/driver/clang-driver.cpp:15:10
    #22 0x7f73e6a3ad84 in __libc_start_main (/lib64/libc.so.6+0x3ad84) (BuildId: 1356e140fb964a20b0d2838960ee69ca6faeb034)
    #23 0x564b0f31642d in _start (/data/users/ayermolo/llvm-build-upstream-msan-release/bin/clang-18+0x315042d)

  Uninitialized value was created by a heap deallocation
    #0 0x564b0f3a2269 in operator delete(void*, std::align_val_t) /home/ayermolo/local/upstream-llvm/llvm-project/compiler-rt/lib/msan/msan_new_delete.cpp:90:3
    #1 0x564b193d94e5 in llvm::DenseMapBase<llvm::DenseMap<llvm::hash_code, llvm::sampleprof::FunctionSamples, llvm::DenseMapInfo<llvm::hash_code, void>, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>>, llvm::hash_code, llvm::sampleprof::FunctionSamples, llvm::DenseMapInfo<llvm::hash_code, void>, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>>::grow(unsigned int) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ADT/DenseMap.h:564:36
    #2 0x564b193d94e5 in llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>* llvm::DenseMapBase<llvm::DenseMap<llvm::hash_code, llvm::sampleprof::FunctionSamples, llvm::DenseMapInfo<llvm::hash_code, void>, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>>, llvm::hash_code, llvm::sampleprof::FunctionSamples, llvm::DenseMapInfo<llvm::hash_code, void>, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>>::InsertIntoBucketImpl<llvm::hash_code>(llvm::hash_code const&, llvm::hash_code const&, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>*) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ADT/DenseMap.h
    #3 0x564b193d94e5 in llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>* llvm::DenseMapBase<llvm::DenseMap<llvm::hash_code, llvm::sampleprof::FunctionSamples, llvm::DenseMapInfo<llvm::hash_code, void>, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>>, llvm::hash_code, llvm::sampleprof::FunctionSamples, llvm::DenseMapInfo<llvm::hash_code, void>, llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>>::InsertIntoBucket<llvm::hash_code const&, llvm::sampleprof::FunctionSamples const&>(llvm::detail::DenseMapPair<llvm::hash_code, llvm::sampleprof::FunctionSamples>*, llvm::hash_code const&, llvm::sampleprof::FunctionSamples const&) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ADT/DenseMap.h:574:17

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:959:30 in llvm::sampleprof::FunctionSamples::getHeadSamplesEstimate() const
Exiting

What were you building when this error happened?

In D147740#4608331, @ayermolo wrote:

@huangjd

MSAN output. Does this ring any bells?

In D147740#4611679, @huangjd wrote:

What were you building when this error happened?

It's C++ code. Some of the more relevant options, I think:
"-triple" "x86_64-redhat-linux-gnu" "-emit-llvm-bc" "-flto=thin" "-flto-unit" ... "-fprofile-sample-use=<path>/__autofdo-bolt-compatible__/out/profile" "-fpseudo-probe-for-profiling"
It's a CSSPGO profile.

I think I figured out what the problem is; writing a patch now.

@ayermolo Please review D158689
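
For readers following the trace: the read in getHeadSamplesEstimate() touches memory that DenseMap::grow() had already freed, which is the pattern you get when a reference to a map entry is held while another insertion makes the table reallocate. That reading is an inference from the stack trace, not something the thread states, and the contents of D158689 are not shown here. The sketch below only illustrates the container property involved: std::unordered_map guarantees that references to elements stay valid across rehashing. `Samples` and `readAfterGrowth` are hypothetical names for illustration, not LLVM code.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for FunctionSamples; just one field to read back.
struct Samples {
  uint64_t HeadSamples = 0;
};

// Takes a pointer to one entry, inserts many more entries (forcing several
// rehashes), then reads through the old pointer. This is well defined for
// std::unordered_map because rehashing never moves the elements themselves.
// A table that stores values inline in a reallocated bucket array would
// leave the pointer dangling after growth.
inline uint64_t readAfterGrowth(std::unordered_map<uint64_t, Samples> &Map) {
  Samples *First = &Map.try_emplace(1, Samples{42}).first->second;
  for (uint64_t K = 2; K < 10000; ++K)
    Map.try_emplace(K, Samples{K});
  return First->HeadSamples;
}
```

With a container that reallocates its storage on growth, the safe variants are to re-look-up the entry after inserting, or to copy the needed value out before inserting.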

huangjd reopened this revision. Aug 28 2023, 12:18 PM

This revision is now accepted and ready to land. Aug 28 2023, 12:18 PM

huangjd mentioned this in D159014: [llvm-profdata] Use std::unordered_map in SampleProfileMap. Aug 28 2023, 1:33 PM

huangjd mentioned this in rG33810543ab3c: [llvm-profdata] Use std::unordered_map in SampleProfileMap. Aug 28 2023, 3:32 PM

huangjd closed this revision. Sep 14 2023, 2:22 PM

huangjd removed a child revision: D152320: [llvm-profdata] Use StringRef in place of string in FunctionSamplesMap. Sep 14 2023, 2:48 PM

GitHub <noreply@github.com> mentioned this in rGf4f85e0ab405: [llvm-profdata] Remove MD5 collision check in D147740 (#66544). Sep 15 2023, 3:31 PM

GitHub <noreply@github.com> mentioned this in rGef0e0adccd94: [llvm-profdata] Do not create numerical strings for MD5 function names read…. Oct 17 2023, 2:09 PM

Revision Contents

Path                                                                          Size
llvm/include/llvm/ProfileData/SampleProf.h                                    212 lines
llvm/include/llvm/ProfileData/SampleProfReader.h                              40 lines
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp                                4 lines
llvm/lib/ProfileData/SampleProf.cpp                                           74 lines
llvm/lib/ProfileData/SampleProfReader.cpp                                     123 lines
llvm/lib/ProfileData/SampleProfWriter.cpp                                     6 lines
llvm/lib/Transforms/IPO/SampleContextTracker.cpp                              4 lines
llvm/test/tools/llvm-profdata/Inputs/sample-nametable-after-samples.profdata
llvm/test/tools/llvm-profdata/sample-nametable.test                           2 lines
llvm/tools/llvm-profdata/llvm-profdata.cpp                                    5 lines
llvm/tools/llvm-profgen/ProfileGenerator.cpp                                  19 lines
llvm/unittests/tools/llvm-profdata/CMakeLists.txt                             1 line
llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp                       166 lines
llvm/unittests/tools/llvm-profdata/OutputSizeLimitTest.cpp                    2 lines

Diff 551240

llvm/include/llvm/ProfileData/SampleProf.h

@@ (312 lines not shown) @@ struct LineLocationHash {
   uint64_t operator()(const LineLocation &Loc) const {
     return std::hash<std::uint64_t>{}((((uint64_t)Loc.LineOffset) << 32) |
                                       Loc.Discriminator);
   }
 };

 raw_ostream &operator<<(raw_ostream &OS, const LineLocation &Loc);

+static inline uint64_t hashFuncName(StringRef F) {
Inline comment (davidxl, Not Done): F is not really function name, so may be called hashFunc(StringRef NameOrMD5) { ...}
+  // If function name is already MD5 string, do not hash again.
+  uint64_t Hash;
+  if (F.getAsInteger(10, Hash))
Inline comment (snehasish, Not Done): Is there a chance that the base 10 encoding may change? Is this the only place where we generate the hashes?
Inline comment (huangjd, Author, Done): No, and doesn't matter. Base-10 encoding is used only because the existing implementation wanted to make it compatible to represent both strings (function names) and integers (MD5), and the solution was to convert MD5 into a base-10 string. This seems inefficient and I am working on a subsequent patch to deal with it.
+    Hash = MD5Hash(F);
+  return Hash;
+}
+
 /// Representation of a single sample record.
 ///
 /// A sample record is represented by a positive integer value, which
 /// indicates how frequently was the associated line location executed.
 ///
 /// Additionally, if the associated location contains a function call,
 /// the record will hold a list of all the possible called targets. For
 /// direct calls, this will be the exact function being invoked. For
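
The hashFuncName helper in this hunk relies on StringRef::getAsInteger returning true on failure, so the name is hashed only when it does not parse as a base-10 integer. A minimal standalone sketch of the same parse-or-hash logic, without LLVM headers, might look like the following. The names `hashNameOrMD5` and `standInHash` are hypothetical, and `standInHash` is a 64-bit FNV-1a used purely as a placeholder; it is not the MD5-based hash LLVM uses.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Placeholder for MD5Hash: 64-bit FNV-1a. This is an assumption made so the
// sketch compiles on its own; it does not produce LLVM's hash values.
inline uint64_t standInHash(std::string_view S) {
  uint64_t H = 14695981039346656037ULL;
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  return H;
}

// If the whole string is a base-10 unsigned 64-bit integer, treat it as an
// already-computed hash; otherwise hash the name.
inline uint64_t hashNameOrMD5(std::string_view NameOrMD5) {
  uint64_t Hash = 0;
  const char *Begin = NameOrMD5.data();
  const char *End = Begin + NameOrMD5.size();
  auto [Ptr, Ec] = std::from_chars(Begin, End, Hash, 10);
  if (Ec == std::errc() && Ptr == End)
    return Hash;
  return standInHash(NameOrMD5);
}
```

One consequence worth noting: a real function whose name consists only of decimal digits would be treated as a precomputed hash by this scheme.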
@@ (297 lines not shown) @@ public:

   std::string toString() const {
     if (!hasContext())
       return Name.str();
     return getContextString(FullContext, false);
   }

   uint64_t getHashCode() const {
-    return hasContext() ? hash_value(getContextFrames())
-                        : hash_value(getName());
+    if (hasContext())
+      return hash_value(getContextFrames());
+
+    // For non-context function name, use its MD5 as hash value, so that it is
+    // consistent with the profile map's key.
+    return hashFuncName(getName());
Inline comment (davidxl, Not Done): how often is the name already MD5 string?
Inline comment (huangjd, Author, Done): If the profile is stored as MD5 format, then all of them are MD5 string, otherwise none. This is some logic from before the refactoring, that the MD5 function name is being converted back and forth between a uint64 and string, and I am planning to change it to a union or similar in the next phase
   }

   /// Set the name of the function and clear the current context.
   void setName(StringRef FunctionName) {
     Name = FunctionName;
     FullContext = SampleContextFrames();
     State = UnknownContext;
   }
@@ (61 lines not shown) @@ private:
   // Full context including calling context and leaf function name
   SampleContextFrames FullContext;
   // State of the associated sample profile
   uint32_t State;
   // Attribute of the associated sample profile
   uint32_t Attributes;
 };

-static inline hash_code hash_value(const SampleContext &arg) {
-  return arg.hasContext() ? hash_value(arg.getContextFrames())
-                          : hash_value(arg.getName());
+static inline hash_code hash_value(const SampleContext &Context) {
+  return Context.getHashCode();
+}
+
+inline raw_ostream &operator<<(raw_ostream &OS, const SampleContext &Context) {
+  return OS << Context.toString();
 }

 class FunctionSamples;
 class SampleProfileReaderItaniumRemapper;

 using BodySampleMap = std::map<LineLocation, SampleRecord>;
 // NOTE: Using a StringMap here makes parsed profiles consume around 17% more
 // memory, which is very significant for large profiles.
@@ (477 lines not shown) @@ return (GUIDToFuncNameMap == Other.GUIDToFuncNameMap ||
            BodySamples == Other.BodySamples &&
            CallsiteSamples == Other.CallsiteSamples;
   }

   bool operator!=(const FunctionSamples &Other) const {
     return !(*this == Other);
   }

+  template <typename T>
+  const T &getKey() const;
+
 private:
   /// CFG hash value for the function.
   uint64_t FunctionHash = 0;

   /// Calling context for function profile
   mutable SampleContext Context;

   /// Total number of samples collected inside this function.
@@ (47 lines not shown) @@ private:
   /// 2 bar();
   /// }
   /// Supposing the stale profile matching algorithm generated the mapping [2 ->
   /// 1], the profile query using the location of bar on the IR which is 2 will
   /// be remapped to 1 and find the location of bar in the profile.
   const LocToLocMap *IRToProfileLocationMap = nullptr;
 };

+template <>
+inline const SampleContext &FunctionSamples::getKey<SampleContext>() const {
+  return getContext();
+}
+
 raw_ostream &operator<<(raw_ostream &OS, const FunctionSamples &FS);

-using SampleProfileMap =
-    std::unordered_map<SampleContext, FunctionSamples, SampleContext::Hash>;
+/// This class is a wrapper to associative container MapT<KeyT, ValueT> using
+/// the hash value of the original key as the new key. This greatly improves the
+/// performance of insert and query operations especially when hash values of
+/// keys are available a priori, and reduces memory usage if KeyT has a large
+/// size.
+/// When performing any action, if an existing entry with a given key is found,
+/// and the interface "KeyT ValueT::getKey<KeyT>() const" to retrieve a value's
+/// original key exists, this class checks if the given key actually matches
+/// the existing entry's original key. If they do not match, this class behaves
+/// as if the entry did not exist (for insertion, this means the new value will
+/// replace the existing entry's value, as if it is newly inserted). If
+/// ValueT::getKey<KeyT>() is not available, all keys with the same hash value
+/// are considered equivalent (i.e. hash collision is silently ignored). Given
+/// such feature this class should only be used where it does not affect
+/// compilation correctness, for example, when loading a sample profile.
+/// Assuming the hashing algorithm is uniform, the probability of hash collision
+/// with 1,000,000 entries is
+/// (2^64)!/((2^64-1000000)!*(2^64)^1000000) ~= 3*10^-8.
+template <template <typename, typename, typename...> typename MapT,
+          typename KeyT, typename ValueT, typename... MapTArgs>
+class HashKeyMap : public MapT<hash_code, ValueT, MapTArgs...> {
+public:
+  using base_type = MapT<hash_code, ValueT, MapTArgs...>;
+  using key_type = hash_code;
+  using original_key_type = KeyT;
+  using mapped_type = ValueT;
+  using value_type = typename base_type::value_type;
+
+  using iterator = typename base_type::iterator;
+  using const_iterator = typename base_type::const_iterator;
Inline comment (davidxl, Done): maybe define SampleProfileMap as a wrapper class and then forbid the unintended interfaces like operator[] ..
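
As an aside on the collision estimate in the doc comment: read literally, the factorial expression is the probability that all 1,000,000 hashes are distinct, so the quoted figure of roughly 3*10^-8 corresponds to its complement. The usual birthday-bound approximation gives the same order of magnitude and can be checked numerically. The function name `collisionProbability` below is hypothetical and the code is a sketch, not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// Approximate probability that at least two of NumKeys uniformly distributed
// 64-bit hashes collide: 1 - exp(-NumKeys * (NumKeys - 1) / (2 * 2^64)).
inline double collisionProbability(double NumKeys) {
  const double Space = std::ldexp(1.0, 64); // 2^64
  const double Exponent = NumKeys * (NumKeys - 1.0) / (2.0 * Space);
  // expm1 keeps precision when the exponent is tiny.
  return -std::expm1(-Exponent);
}
```

For 1,000,000 keys this evaluates to about 2.7 * 10^-8, consistent with the rounded figure in the comment.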

		private:
		snehasishUnsubmitted Not Done Reply Inline Actions Can this lead to non-deterministic builds? snehasish: Can this lead to non-deterministic builds?
		huangjdAuthorUnsubmitted Done Reply Inline Actions Why? There's no non-determinism here, the existing entry always gets erased first. SampleProfWriter does sort the profile before writing so it's ok huangjd: Why? There's no non-determinism here, the existing entry always gets erased first.
		// If the value type has getKey(), retrieve its original key for comparison.
		template <typename U = mapped_type,
		typename = decltype(U().template getKey<original_key_type>())>
		davidxlUnsubmitted Not Done Reply Inline Actions Since this forces insert in case of conflict, should it be called just 'emplace'? davidxl: Since this forces insert in case of conflict, should it be called just 'emplace'?
		huangjdAuthorUnsubmitted Done Reply Inline Actions try_emplace functionally is same as emplace (https://en.cppreference.com/w/cpp/container/unordered_map/try_emplace), only difference is that try_emplace does not move the arguments to construct a mapped_type if key exists. huangjd: try_emplace functionally is same as emplace (https://en.cppreference.
		static bool
		CheckKeyMatch(const original_key_type &Key, const mapped_type &ExistingValue,
		original_key_type *ExistingKeyIfDifferent = nullptr) {
		const original_key_type &ExistingKey =
		ExistingValue.template getKey<original_key_type>();
		bool Result = (Key == ExistingKey);
		if (!Result && ExistingKeyIfDifferent)
		*ExistingKeyIfDifferent = ExistingKey;
		return Result;
		davidxl: potential memory leak? Also why inserting an empty FunctionSamples instead of the one passed in?
		huangjd: Fixed. FunctionSamples is copy assignable.
		}

		// If getKey() does not exist, this overload is selected, which assumes all
		// keys with the same hash are equivalent.
		static bool CheckKeyMatch(...) { return true; }

		public:
		template <typename... Ts>
		std::pair<iterator, bool> try_emplace(const key_type &Hash,
		const original_key_type &Key,
		Ts &&...Args) {
		assert(Hash == hash_value(Key));
		auto Ret = base_type::try_emplace(Hash, std::forward<Ts>(Args)...);
		if (!Ret.second) {
		davidxl: what is the purpose of this wrapper?
		huangjd: existing code compatibility
		original_key_type ExistingKey;
		if (LLVM_UNLIKELY(!CheckKeyMatch(Key, Ret.first->second, &ExistingKey))) {
		dbgs() << "MD5 collision detected: " << Key << " and " << ExistingKey
		<< " has same hash value " << Hash << "\n";
		davidxl: This interface is confusing. Users expect it to return the existing entry, but here it erases it. Should this interface be hidden (and not allowed with assert)?
		huangjd: In C++, [] is the same as try_emplace with a default-constructed mapped_type, so I am keeping the behavior consistent. Keeping this function because many places use it.
		Ret.second = true;
		Ret.first->second = mapped_type(std::forward<Ts>(Args)...);
		}
		}
		return Ret;
		}

		template <typename... Ts>
		davidxl: is this always true as currently implemented?
		huangjd: This returns false if the existing entry actually has the same context, which indicates a match rather than an MD5 collision, so there is no need to set the context again (and setting the context actually erases the flags in the context, which are not used for equality comparison or hashing).
		std::pair<iterator, bool> try_emplace(const original_key_type &Key,
		Ts &&...Args) {
		key_type Hash = hash_value(Key);
		return try_emplace(Hash, Key, std::forward<Ts>(Args)...);
		}

		template <typename... Ts> std::pair<iterator, bool> emplace(Ts &&...Args) {
		return try_emplace(std::forward<Ts>(Args)...);
		}

		mapped_type &operator[](const original_key_type &Key) {
		return try_emplace(Key, mapped_type()).first->second;
		}

		iterator find(const original_key_type &Key) {
		key_type Hash = hash_value(Key);
		auto It = base_type::find(Hash);
		if (It != base_type::end())
		if (LLVM_LIKELY(CheckKeyMatch(Key, It->second)))
		return It;
		return base_type::end();
		}

		const_iterator find(const original_key_type &Key) const {
		snehasish: typo "function"
		key_type Hash = hash_value(Key);
		auto It = base_type::find(Hash);
		if (It != base_type::end())
		if (LLVM_LIKELY(CheckKeyMatch(Key, It->second)))
		return It;
		return base_type::end();
		}

		size_t erase(const original_key_type &Ctx) {
		auto It = find(Ctx);
		if (It != base_type::end()) {
		base_type::erase(It);
		return 1;
		}
		return 0;
		}
		};
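A minimal standalone sketch of the idea behind HashKeyMap, assuming std::unordered_map in place of the MapT template parameter: the container is keyed by the hash only, the value remembers its original key, and an insert that hits the same hash with a different original key overwrites the old entry, mirroring the try_emplace logic above. The names Entry, HashOnlyMap and tryEmplace are hypothetical and do not appear in the patch; the hash is passed in explicitly so that a collision is easy to simulate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical value type that remembers its original key, playing the role
// that FunctionSamples (via its context) plays in the patch.
struct Entry {
  std::string Key;
  int Samples = 0;
};

// Hypothetical, simplified analogue of HashKeyMap: only the hash is the key.
class HashOnlyMap {
public:
  using Map = std::unordered_map<uint64_t, Entry>;

  // Insert a fresh entry for Key. If the hash is already present but belongs
  // to a different original key, treat it as a collision and overwrite.
  std::pair<Map::iterator, bool> tryEmplace(uint64_t Hash,
                                            const std::string &Key) {
    auto Ret = Storage.try_emplace(Hash, Entry{Key, 0});
    if (!Ret.second && Ret.first->second.Key != Key) {
      Ret.first->second = Entry{Key, 0};
      Ret.second = true;
    }
    return Ret;
  }

  // A hit counts only if the stored original key also matches.
  Entry *find(uint64_t Hash, const std::string &Key) {
    auto It = Storage.find(Hash);
    if (It != Storage.end() && It->second.Key == Key)
      return &It->second;
    return nullptr;
  }

  std::size_t size() const { return Storage.size(); }

private:
  Map Storage;
};
```

With this sketch, inserting "foo" and then "bar" under the same hash leaves a single entry for "bar", and a later lookup of "foo" under that hash reports a miss.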

		/// This class provides operator overloads to the map container using MD5 as the
		/// key type, so that existing code can still work in most cases using
		/// SampleContext as key.
		/// Note: when populating container, make sure to assign the SampleContext to
		/// the mapped value immediately because the key no longer holds it.
		class SampleProfileMap
		: public HashKeyMap<DenseMap, SampleContext, FunctionSamples> {
		public:
		// Convenience method because this is being used in many places. Set the
		// FunctionSamples' context if it's newly inserted.
		mapped_type &Create(const SampleContext &Ctx) {
		auto Ret = try_emplace(Ctx, FunctionSamples());
		if (Ret.second)
		Ret.first->second.setContext(Ctx);
		return Ret.first->second;
		}

		iterator find(const SampleContext &Ctx) {
		return HashKeyMap<llvm::DenseMap, SampleContext, FunctionSamples>::find(
		Ctx);
		}

using NameFunctionSamples = std::pair<SampleContext, const FunctionSamples *>;		const_iterator find(const SampleContext &Ctx) const {
		return HashKeyMap<llvm::DenseMap, SampleContext, FunctionSamples>::find(
		Ctx);
		}

		// Overloaded find() to lookup a function by name. This is called by IPO
		// passes with an actual function name, and it is possible that the profile
		// reader converted function names in the profile to MD5 strings, so we need
		// to check if either representation matches.
		iterator find(StringRef Fname) {
		uint64_t Hash = hashFuncName(Fname);
		auto It = base_type::find(hash_code(Hash));
		if (It != end()) {
		StringRef CtxName = It->second.getContext().getName();
		if (LLVM_LIKELY(CtxName == Fname || CtxName == std::to_string(Hash)))
		return It;
		}
		return end();
		}

		size_t erase(const SampleContext &Ctx) {
		return HashKeyMap<llvm::DenseMap, SampleContext, FunctionSamples>::erase(
		Ctx);
		}

		size_t erase(const key_type &Key) { return base_type::erase(Key); }
		};
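The name-based find() overload above accepts a match on either the plain function name or the decimal string of its hash. A small sketch of that check, under stated assumptions: hashName is a hypothetical stand-in for hashFuncName (it uses std::hash, not MD5), and NameByHash and findByName are illustrative names that are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for hashFuncName(); NOT an MD5 computation.
inline uint64_t hashName(const std::string &Name) {
  return std::hash<std::string>{}(Name);
}

// Maps a name hash to the name stored in the profile. The stored name may be
// the real function name or the decimal string of the hash.
using NameByHash = std::unordered_map<uint64_t, std::string>;

// Return the stored name if it matches Fname in either representation,
// otherwise nullptr.
inline const std::string *findByName(const NameByHash &Profiles,
                                     const std::string &Fname) {
  uint64_t Hash = hashName(Fname);
  auto It = Profiles.find(Hash);
  if (It == Profiles.end())
    return nullptr;
  const std::string &Stored = It->second;
  if (Stored == Fname || Stored == std::to_string(Hash))
    return &Stored;
  return nullptr;
}
```

An entry whose stored name is neither the queried name nor the decimal hash string is reported as a miss, which is how a hash collision between two different functions would surface in this sketch.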

		using NameFunctionSamples = std::pair<hash_code, const FunctionSamples *>;

void sortFuncProfiles(const SampleProfileMap &ProfileMap,		void sortFuncProfiles(const SampleProfileMap &ProfileMap,
std::vector<NameFunctionSamples> &SortedProfiles);		std::vector<NameFunctionSamples> &SortedProfiles);

/// Sort a LocationT->SampleT map by LocationT.		/// Sort a LocationT->SampleT map by LocationT.
///		///
/// It produces a sorted list of <LocationT, SampleT> records by ascending		/// It produces a sorted list of <LocationT, SampleT> records by ascending
/// order of LocationT.		/// order of LocationT.
Show All 29 Lines	public:
// mainly to honor the preinliner decision. Note that when MergeColdContext is		// mainly to honor the preinliner decision. Note that when MergeColdContext is
// true, preinliner decision is not honored anyway so TrimBaseProfileOnly will		// true, preinliner decision is not honored anyway so TrimBaseProfileOnly will
// be ignored.		// be ignored.
void trimAndMergeColdContextProfiles(uint64_t ColdCountThreshold,		void trimAndMergeColdContextProfiles(uint64_t ColdCountThreshold,
bool TrimColdContext,		bool TrimColdContext,
bool MergeColdContext,		bool MergeColdContext,
uint32_t ColdContextFrameLength,		uint32_t ColdContextFrameLength,
bool TrimBaseProfileOnly);		bool TrimBaseProfileOnly);
// Canonicalize context profile name and attributes.
void canonicalizeContextProfiles();

private:		private:
SampleProfileMap &ProfileMap;		SampleProfileMap &ProfileMap;
};		};

/// Helper class for profile conversion.		/// Helper class for profile conversion.
///		///
/// It supports full context-sensitive profile to nested profile conversion,		/// It supports full context-sensitive profile to nested profile conversion,
Show All 29 Lines	static void flattenProfile(SampleProfileMap &ProfileMap,
flattenProfile(ProfileMap, TmpProfiles, ProfileIsCS);		flattenProfile(ProfileMap, TmpProfiles, ProfileIsCS);
ProfileMap = std::move(TmpProfiles);		ProfileMap = std::move(TmpProfiles);
}		}

static void flattenProfile(const SampleProfileMap &InputProfiles,		static void flattenProfile(const SampleProfileMap &InputProfiles,
SampleProfileMap &OutputProfiles,		SampleProfileMap &OutputProfiles,
bool ProfileIsCS = false) {		bool ProfileIsCS = false) {
if (ProfileIsCS) {		if (ProfileIsCS) {
for (const auto &I : InputProfiles)		for (const auto &I : InputProfiles) {
OutputProfiles[I.second.getName()].merge(I.second);
// Retain the profile name and clear the full context for each function		// Retain the profile name and clear the full context for each function
// profile.		// profile.
for (auto &I : OutputProfiles)		FunctionSamples &FS = OutputProfiles.Create(I.second.getName());
I.second.setContext(SampleContext(I.first));		FS.merge(I.second);
		}
} else {		} else {
for (const auto &I : InputProfiles)		for (const auto &I : InputProfiles)
flattenNestedProfile(OutputProfiles, I.second);		flattenNestedProfile(OutputProfiles, I.second);
}		}
}		}

private:		private:
static void flattenNestedProfile(SampleProfileMap &OutputProfiles,		static void flattenNestedProfile(SampleProfileMap &OutputProfiles,

llvm/include/llvm/ProfileData/SampleProfReader.h

///		///
/// The reader supports two file formats: text and binary. The text format		/// The reader supports two file formats: text and binary. The text format
/// is useful for debugging and testing, while the binary format is more		/// is useful for debugging and testing, while the binary format is more
/// compact and I/O efficient. They can both be used interchangeably.		/// compact and I/O efficient. They can both be used interchangeably.
class SampleProfileReader {		class SampleProfileReader {
public:		public:
SampleProfileReader(std::unique_ptr<MemoryBuffer> B, LLVMContext &C,		SampleProfileReader(std::unique_ptr<MemoryBuffer> B, LLVMContext &C,
SampleProfileFormat Format = SPF_None)		SampleProfileFormat Format = SPF_None)
: Profiles(0), Ctx(C), Buffer(std::move(B)), Format(Format) {}		: Profiles(), Ctx(C), Buffer(std::move(B)), Format(Format) {}

virtual ~SampleProfileReader() = default;		virtual ~SampleProfileReader() = default;

/// Read and validate the file header.		/// Read and validate the file header.
virtual std::error_code readHeader() = 0;		virtual std::error_code readHeader() = 0;

/// Set the bits for FS discriminators. Parameter Pass specify the sequence		/// Set the bits for FS discriminators. Parameter Pass specify the sequence
/// number, Pass == i is for the i-th round of adding FS discriminators.		/// number, Pass == i is for the i-th round of adding FS discriminators.
Show All 19 Lines	if (Remapper)
Remapper->applyRemapping(Ctx);		Remapper->applyRemapping(Ctx);
FunctionSamples::UseMD5 = useMD5();		FunctionSamples::UseMD5 = useMD5();
return sampleprof_error::success;		return sampleprof_error::success;
}		}

/// The implementation to read sample profiles from the associated file.		/// The implementation to read sample profiles from the associated file.
virtual std::error_code readImpl() = 0;		virtual std::error_code readImpl() = 0;

/// Print the profile for \p FContext on stream \p OS.		/// Print the profile for \p FS on stream \p OS.
void dumpFunctionProfile(SampleContext FContext, raw_ostream &OS = dbgs());		void dumpFunctionProfile(const FunctionSamples &FS, raw_ostream &OS = dbgs());

/// Collect functions with definitions in Module M. For reader which		/// Collect functions with definitions in Module M. For reader which
/// support loading function profiles on demand, return true when the		/// support loading function profiles on demand, return true when the
/// reader has been given a module. Always return false for reader		/// reader has been given a module. Always return false for reader
/// which doesn't support loading function profiles on demand.		/// which doesn't support loading function profiles on demand.
virtual bool collectFuncsFromModule() { return false; }		virtual bool collectFuncsFromModule() { return false; }

/// Print all the profiles on stream \p OS.		/// Print all the profiles on stream \p OS.
void dump(raw_ostream &OS = dbgs());		void dump(raw_ostream &OS = dbgs());

/// Print all the profiles on stream \p OS in the JSON format.		/// Print all the profiles on stream \p OS in the JSON format.
void dumpJson(raw_ostream &OS = dbgs());		void dumpJson(raw_ostream &OS = dbgs());

/// Return the samples collected for function \p F.		/// Return the samples collected for function \p F.
FunctionSamples *getSamplesFor(const Function &F) {		FunctionSamples *getSamplesFor(const Function &F) {
// The function name may have been updated by adding suffix. Call		// The function name may have been updated by adding suffix. Call
// a helper to (optionally) strip off suffixes so that we can		// a helper to (optionally) strip off suffixes so that we can
// match against the original function name in the profile.		// match against the original function name in the profile.
StringRef CanonName = FunctionSamples::getCanonicalFnName(F);		StringRef CanonName = FunctionSamples::getCanonicalFnName(F);
return getSamplesFor(CanonName);		return getSamplesFor(CanonName);
}		}

/// Return the samples collected for function \p F.		/// Return the samples collected for function \p F.
		davidxl: Why not using the new interface SampleProfileMapTryEmplace?
		huangjd: This is one of the exception cases because of the additional processing at line 425. This function intends to succeed even if a collision happens.
		davidxl: You can add an extra parameter: insert_on_conflict (default to false).
virtual FunctionSamples *getSamplesFor(StringRef Fname) {		FunctionSamples *getSamplesFor(StringRef Fname) {
std::string FGUID;
Fname = getRepInFormat(Fname, useMD5(), FGUID);
auto It = Profiles.find(Fname);		auto It = Profiles.find(Fname);
if (It != Profiles.end())		if (It != Profiles.end())
return &It->second;		return &It->second;

if (Remapper) {		if (Remapper) {
		davidxl: have a helper for the common code sequence?
if (auto NameInProfile = Remapper->lookUpNameInProfile(Fname)) {		if (auto NameInProfile = Remapper->lookUpNameInProfile(Fname)) {
auto It = Profiles.find(*NameInProfile);		auto It = Profiles.find(*NameInProfile);
if (It != Profiles.end())		if (It != Profiles.end())
return &It->second;		return &It->second;
}		}
}		}
return nullptr;		return nullptr;
}		}
▲ Show 20 Lines • Show All 80 Lines • ▼ Show 20 Lines	protected:
/// LLVM context used to emit diagnostics.		/// LLVM context used to emit diagnostics.
LLVMContext &Ctx;		LLVMContext &Ctx;

/// Memory buffer holding the profile file.		/// Memory buffer holding the profile file.
std::unique_ptr<MemoryBuffer> Buffer;		std::unique_ptr<MemoryBuffer> Buffer;

/// Extra name buffer holding names created on demand.		/// Extra name buffer holding names created on demand.
/// This should only be needed for md5 profiles.		/// This should only be needed for md5 profiles.
std::unordered_set<std::string> MD5NameBuffer;		std::unordered_set<std::string> MD5NameBuffer;
		davidxl: why changing it to list?
		huangjd: It's faster. The buffer is never directly accessed. It is a backing buffer for StringRefs.
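The reply above describes the buffer as a backing store for StringRefs that is never looked up directly. A sketch of that pattern, assuming std::list<std::string> as the storage and std::string_view in place of llvm::StringRef; NameBuffer and intern are hypothetical names. std::list never relocates its elements on insertion, so a view handed out earlier stays valid, and appending needs no hashing. Unlike a set, this sketch does not deduplicate.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <string>
#include <string_view>

// Hypothetical backing buffer: owns the strings, hands out views into them.
struct NameBuffer {
  std::list<std::string> Storage;

  // Append the decimal form of Hash and return a view of the stored string.
  // List nodes are never moved, so the view stays valid as the list grows.
  std::string_view intern(uint64_t Hash) {
    Storage.emplace_back(std::to_string(Hash));
    return Storage.back();
  }
};
```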

/// Profile summary information.		/// Profile summary information.
std::unique_ptr<ProfileSummary> Summary;		std::unique_ptr<ProfileSummary> Summary;

/// Take ownership of the summary of this reader.		/// Take ownership of the summary of this reader.
static std::unique_ptr<ProfileSummary>		static std::unique_ptr<ProfileSummary>
takeSummary(SampleProfileReader &Reader) {		takeSummary(SampleProfileReader &Reader) {
return std::move(Reader.Summary);		return std::move(Reader.Summary);
▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	protected:
std::error_code readMagicIdent();		std::error_code readMagicIdent();

/// Read profile summary.		/// Read profile summary.
std::error_code readSummary();		std::error_code readSummary();

/// Read the whole name table.		/// Read the whole name table.
std::error_code readNameTable();		std::error_code readNameTable();

/// Read a string indirectly via the name table.		/// Read a string indirectly via the name table. Optionally return the index.
ErrorOr<StringRef> readStringFromTable();		ErrorOr<StringRef> readStringFromTable(size_t *RetIdx = nullptr);
		davidxl: How effective is MD5 compacting with ULEB128? Why not use a fixed-length representation to save some computation time?
		huangjd: This is from the existing design: a profile can store MD5 as fixed length or ULEB128. The implementation contains logic to support both.
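To put the size question above in concrete terms, here is a generic ULEB128 encoder and decoder written from the standard definition of the encoding (7 payload bits per byte, high bit set on every byte except the last). It is not the reader's or LLVM's implementation, and encodeULEB128 and decodeULEB128 are names local to this sketch. Any value of at least 2^56 needs 9 or 10 bytes in this encoding, so a uniformly distributed 64-bit hash is usually larger as ULEB128 than as a fixed 8-byte field.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode Value as ULEB128: emit 7 bits at a time, least significant first.
inline std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}

// Decode a ULEB128 byte sequence produced by encodeULEB128.
inline uint64_t decodeULEB128(const std::vector<uint8_t> &In) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  for (uint8_t Byte : In) {
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0)
      break;
  }
  return Value;
}
```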

/// Read a context indirectly via the CSNameTable.		/// Read a context indirectly via the CSNameTable. Optionally return the
ErrorOr<SampleContextFrames> readContextFromTable();		/// index.
		ErrorOr<SampleContextFrames> readContextFromTable(size_t *RetIdx = nullptr);

/// Read a context indirectly via the CSNameTable if the profile has context,		/// Read a context indirectly via the CSNameTable if the profile has context,
/// otherwise same as readStringFromTable.		/// otherwise same as readStringFromTable, also return its hash value.
ErrorOr<SampleContext> readSampleContextFromTable();		ErrorOr<std::pair<SampleContext, uint64_t>> readSampleContextFromTable();

/// Points to the current location in the buffer.		/// Points to the current location in the buffer.
const uint8_t *Data = nullptr;		const uint8_t *Data = nullptr;

/// Points to the end of the buffer.		/// Points to the end of the buffer.
const uint8_t *End = nullptr;		const uint8_t *End = nullptr;

/// Function name table.		/// Function name table.
std::vector<StringRef> NameTable;		std::vector<StringRef> NameTable;

/// If MD5 is used in NameTable section, the section saves uint64_t data.		/// If MD5 is used in NameTable section, the section saves uint64_t data.
/// The uint64_t data has to be converted to a string and then the string		/// The uint64_t data has to be converted to a string and then the string
/// will be used to initialize StringRef in NameTable.		/// will be used to initialize StringRef in NameTable.
/// Note NameTable contains StringRef so it needs another buffer to own		/// Note NameTable contains StringRef so it needs another buffer to own
/// the string data. MD5StringBuf serves as the string buffer that is		/// the string data. MD5StringBuf serves as the string buffer that is
/// referenced by NameTable (vector of StringRef). We make sure		/// referenced by NameTable (vector of StringRef). We make sure
/// the lifetime of MD5StringBuf is not shorter than that of NameTable.		/// the lifetime of MD5StringBuf is not shorter than that of NameTable.
std::vector<std::string> MD5StringBuf;		std::vector<std::string> MD5StringBuf;

/// The starting address of NameTable containing fixed length MD5.		/// The starting address of fixed length MD5 name table section.
const uint8_t *MD5NameMemStart = nullptr;		const uint8_t *MD5NameMemStart = nullptr;

/// CSNameTable is used to save full context vectors. It is the backing buffer		/// CSNameTable is used to save full context vectors. It is the backing buffer
/// for SampleContextFrames.		/// for SampleContextFrames.
std::vector<SampleContextFrameVector> CSNameTable;		std::vector<SampleContextFrameVector> CSNameTable;

		/// Table to cache MD5 values of sample contexts corresponding to
		/// readSampleContextFromTable(), used to index into Profiles or
		/// FuncOffsetTable.
		std::vector<uint64_t> MD5SampleContextTable;

		/// The starting address of the table of MD5 values of sample contexts. For
		/// fixed length MD5 non-CS profile it is same as MD5NameMemStart because
		/// hashes of non-CS contexts are already in the profile. Otherwise it points
		/// to the start of MD5SampleContextTable.
		const uint64_t *MD5SampleContextStart = nullptr;

private:		private:
std::error_code readSummaryEntry(std::vector<ProfileSummaryEntry> &Entries);		std::error_code readSummaryEntry(std::vector<ProfileSummaryEntry> &Entries);
virtual std::error_code verifySPMagic(uint64_t Magic) = 0;		virtual std::error_code verifySPMagic(uint64_t Magic) = 0;
};		};

class SampleProfileReaderRawBinary : public SampleProfileReaderBinary {		class SampleProfileReaderRawBinary : public SampleProfileReaderBinary {
private:		private:
std::error_code verifySPMagic(uint64_t Magic) override;		std::error_code verifySPMagic(uint64_t Magic) override;
▲ Show 20 Lines • Show All 57 Lines • ▼ Show 20 Lines	protected:
virtual std::error_code readCustomSection(const SecHdrTableEntry &Entry) = 0;		virtual std::error_code readCustomSection(const SecHdrTableEntry &Entry) = 0;

/// Determine which container readFuncOffsetTable() should populate, the list		/// Determine which container readFuncOffsetTable() should populate, the list
/// FuncOffsetList or the map FuncOffsetTable.		/// FuncOffsetList or the map FuncOffsetTable.
bool useFuncOffsetList() const;		bool useFuncOffsetList() const;

std::unique_ptr<ProfileSymbolList> ProfSymList;		std::unique_ptr<ProfileSymbolList> ProfSymList;

/// The table mapping from function context to the offset of its		/// The table mapping from a function context's MD5 to the offset of its
/// FunctionSample towards file start.		/// FunctionSample towards file start.
/// At most one of FuncOffsetTable and FuncOffsetList is populated.		/// At most one of FuncOffsetTable and FuncOffsetList is populated.
DenseMap<SampleContext, uint64_t> FuncOffsetTable;		DenseMap<hash_code, uint64_t> FuncOffsetTable;

/// The list version of FuncOffsetTable. This is used if every entry is		/// The list version of FuncOffsetTable. This is used if every entry is
/// being accessed.		/// being accessed.
std::vector<std::pair<SampleContext, uint64_t>> FuncOffsetList;		std::vector<std::pair<SampleContext, uint64_t>> FuncOffsetList;

/// The set containing the functions to use when compiling a module.		/// The set containing the functions to use when compiling a module.
DenseSet<StringRef> FuncsToUse;		DenseSet<StringRef> FuncsToUse;


llvm/lib/ProfileData/ProfileSummaryBuilder.cpp

Show First 20 Lines • Show All 198 Lines • ▼ Show 20 Lines	SampleProfileSummaryBuilder::computeSummaryForProfiles(
// For CSSPGO, context-sensitive profile effectively split a function profile		// For CSSPGO, context-sensitive profile effectively split a function profile
// into many copies each representing the CFG profile of a particular calling		// into many copies each representing the CFG profile of a particular calling
// context. That makes the count distribution looks more flat as we now have		// context. That makes the count distribution looks more flat as we now have
// more function profiles each with lower counts, which in turn leads to lower		// more function profiles each with lower counts, which in turn leads to lower
// hot thresholds. To compensate for that, by default we merge context		// hot thresholds. To compensate for that, by default we merge context
// profiles before computing profile summary.		// profiles before computing profile summary.
if (UseContextLessSummary || (sampleprof::FunctionSamples::ProfileIsCS &&		if (UseContextLessSummary || (sampleprof::FunctionSamples::ProfileIsCS &&
!UseContextLessSummary.getNumOccurrences())) {		!UseContextLessSummary.getNumOccurrences())) {
for (const auto &I : Profiles) {		ProfileConverter::flattenProfile(Profiles, ContextLessProfiles, true);
		snehasish: Document the parameter name /*ProfileIsCS=*/?
ContextLessProfiles[I.second.getName()].merge(I.second);
}
ProfilesToUse = &ContextLessProfiles;		ProfilesToUse = &ContextLessProfiles;
}		}

for (const auto &I : *ProfilesToUse) {		for (const auto &I : *ProfilesToUse) {
const sampleprof::FunctionSamples &Profile = I.second;		const sampleprof::FunctionSamples &Profile = I.second;
addRecord(Profile);		addRecord(Profile);
}		}


llvm/lib/ProfileData/SampleProf.cpp

Show First 20 Lines • Show All 196 Lines • ▼ Show 20 Lines	raw_ostream &llvm::sampleprof::operator<<(raw_ostream &OS,
FS.print(OS);		FS.print(OS);
return OS;		return OS;
}		}

void sampleprof::sortFuncProfiles(		void sampleprof::sortFuncProfiles(
const SampleProfileMap &ProfileMap,		const SampleProfileMap &ProfileMap,
std::vector<NameFunctionSamples> &SortedProfiles) {		std::vector<NameFunctionSamples> &SortedProfiles) {
for (const auto &I : ProfileMap) {		for (const auto &I : ProfileMap) {
assert(I.first == I.second.getContext() && "Inconsistent profile map");		SortedProfiles.push_back(std::make_pair(I.first, &I.second));
		snehasish: nit: drop the make_pair in favour of an initializer list?
SortedProfiles.push_back(std::make_pair(I.second.getContext(), &I.second));
}		}
llvm::stable_sort(SortedProfiles, [](const NameFunctionSamples &A,		llvm::stable_sort(SortedProfiles, [](const NameFunctionSamples &A,
const NameFunctionSamples &B) {		const NameFunctionSamples &B) {
if (A.second->getTotalSamples() == B.second->getTotalSamples())		if (A.second->getTotalSamples() == B.second->getTotalSamples())
return A.first < B.first;		return A.second->getContext() < B.second->getContext();
return A.second->getTotalSamples() > B.second->getTotalSamples();		return A.second->getTotalSamples() > B.second->getTotalSamples();
});		});
}		}
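The comparator above orders profiles by total samples, largest first, and in the new version breaks ties on the context stored in the FunctionSamples rather than on the map key, which is now a hash. A standalone sketch of the same ordering, assuming a simplified Profile struct with a string context; Profile and sortProfiles are hypothetical names and std::stable_sort stands in for llvm::stable_sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for FunctionSamples: a context and a sample count.
struct Profile {
  std::string Context;
  uint64_t TotalSamples;
};

// Sort by descending sample count; break ties by ascending context so the
// order does not depend on hash values or container iteration order.
inline void sortProfiles(std::vector<const Profile *> &Sorted) {
  std::stable_sort(Sorted.begin(), Sorted.end(),
                   [](const Profile *A, const Profile *B) {
                     if (A->TotalSamples == B->TotalSamples)
                       return A->Context < B->Context;
                     return A->TotalSamples > B->TotalSamples;
                   });
}
```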

unsigned FunctionSamples::getOffset(const DILocation *DIL) {		unsigned FunctionSamples::getOffset(const DILocation *DIL) {
return (DIL->getLine() - DIL->getScope()->getSubprogram()->getLine()) &		return (DIL->getLine() - DIL->getScope()->getSubprogram()->getLine()) &
0xffff;		0xffff;
}		}
▲ Show 20 Lines • Show All 132 Lines • ▼ Show 20 Lines	void SampleContextTrimmer::trimAndMergeColdContextProfiles(
// Trimming base profiles only is mainly to honor the preinliner decision. When		// Trimming base profiles only is mainly to honor the preinliner decision. When
// MergeColdContext is true preinliner decision is not honored anyway so turn		// MergeColdContext is true preinliner decision is not honored anyway so turn
// off TrimBaseProfileOnly.		// off TrimBaseProfileOnly.
if (MergeColdContext)		if (MergeColdContext)
TrimBaseProfileOnly = false;		TrimBaseProfileOnly = false;

// Filter the cold profiles from ProfileMap and move them into a tmp		// Filter the cold profiles from ProfileMap and move them into a tmp
// container		// container
std::vector<std::pair<SampleContext, const FunctionSamples *>> ColdProfiles;		std::vector<std::pair<hash_code, const FunctionSamples *>> ColdProfiles;
for (const auto &I : ProfileMap) {		for (const auto &I : ProfileMap) {
const SampleContext &Context = I.first;		const SampleContext &Context = I.second.getContext();
const FunctionSamples &FunctionProfile = I.second;		const FunctionSamples &FunctionProfile = I.second;
if (FunctionProfile.getTotalSamples() < ColdCountThreshold &&		if (FunctionProfile.getTotalSamples() < ColdCountThreshold &&
(!TrimBaseProfileOnly || Context.isBaseContext()))		(!TrimBaseProfileOnly || Context.isBaseContext()))
ColdProfiles.emplace_back(Context, &I.second);		ColdProfiles.emplace_back(I.first, &I.second);
}		}

// Remove the cold profile from ProfileMap and merge them into		// Remove the cold profile from ProfileMap and merge them into
// MergedProfileMap by the last K frames of context		// MergedProfileMap by the last K frames of context
SampleProfileMap MergedProfileMap;		SampleProfileMap MergedProfileMap;
for (const auto &I : ColdProfiles) {		for (const auto &I : ColdProfiles) {
if (MergeColdContext) {		if (MergeColdContext) {
auto MergedContext = I.second->getContext().getContextFrames();		auto MergedContext = I.second->getContext().getContextFrames();
if (ColdContextFrameLength < MergedContext.size())		if (ColdContextFrameLength < MergedContext.size())
MergedContext = MergedContext.take_back(ColdContextFrameLength);		MergedContext = MergedContext.take_back(ColdContextFrameLength);
auto Ret = MergedProfileMap.emplace(MergedContext, FunctionSamples());		// Need to set MergedProfile's context here otherwise it will be lost.
FunctionSamples &MergedProfile = Ret.first->second;		FunctionSamples &MergedProfile = MergedProfileMap.Create(MergedContext);
MergedProfile.merge(*I.second);		MergedProfile.merge(*I.second);
}		}
ProfileMap.erase(I.first);		ProfileMap.erase(I.first);
}		}

// Move the merged profiles into ProfileMap;		// Move the merged profiles into ProfileMap;
for (const auto &I : MergedProfileMap) {		for (const auto &I : MergedProfileMap) {
// Filter the cold merged profile		// Filter the cold merged profile
if (TrimColdContext && I.second.getTotalSamples() < ColdCountThreshold &&		if (TrimColdContext && I.second.getTotalSamples() < ColdCountThreshold &&
ProfileMap.find(I.first) == ProfileMap.end())		ProfileMap.find(I.second.getContext()) == ProfileMap.end())
continue;		continue;
// Merge the profile if the original profile exists, otherwise just insert		// Merge the profile if the original profile exists, otherwise just insert
// as a new profile		// as a new profile. If inserted as a new profile from MergedProfileMap, it
auto Ret = ProfileMap.emplace(I.first, FunctionSamples());		// already has the right context.
if (Ret.second) {		auto Ret = ProfileMap.emplace(I.second.getContext(), FunctionSamples());
SampleContext FContext(Ret.first->first, RawContext);
FunctionSamples &FProfile = Ret.first->second;
FProfile.setContext(FContext);
}
FunctionSamples &OrigProfile = Ret.first->second;		FunctionSamples &OrigProfile = Ret.first->second;
OrigProfile.merge(I.second);		OrigProfile.merge(I.second);
}		}
}		}

void SampleContextTrimmer::canonicalizeContextProfiles() {
davidxl: is this a dead function? should it be removed in a separate patch?
huangjd: After refactoring, the invariant key == value.getContext() is moot, so this function is dead.
std::vector<SampleContext> ProfilesToBeRemoved;
SampleProfileMap ProfilesToBeAdded;
for (auto &I : ProfileMap) {
FunctionSamples &FProfile = I.second;
SampleContext &Context = FProfile.getContext();
if (I.first == Context)
continue;

// Use the context string from FunctionSamples to update the keys of
// ProfileMap. They can get out of sync after context profile promotion
// through pre-inliner.
// Duplicate the function profile for later insertion to avoid a conflict
// caused by a context both to be added and to be removed. This could happen
// when a context is promoted to another context which is also promoted to
// the third context. For example, given an original context A @ B @ C that
// is promoted to B @ C and the original context B @ C which is promoted to
// just C, adding B @ C to the profile map while removing same context (but
// with different profiles) from the map can cause a conflict if they are
// not handled in a right order. This can be solved by just caching the
// profiles to be added.
auto Ret = ProfilesToBeAdded.emplace(Context, FProfile);
(void)Ret;
assert(Ret.second && "Context conflict during canonicalization");
ProfilesToBeRemoved.push_back(I.first);
}

for (auto &I : ProfilesToBeRemoved) {
ProfileMap.erase(I);
}

for (auto &I : ProfilesToBeAdded) {
ProfileMap.emplace(I.first, I.second);
}
}
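
Note on the removed function above: its comment describes caching the profiles to be added so that re-keying does not conflict with entries that are still waiting to be removed. A minimal sketch of that pattern, assuming a plain std::map keyed by a context string and a hypothetical Profile struct in place of SampleProfileMap and FunctionSamples:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for FunctionSamples: it carries its own context.
struct Profile {
  std::string Context;
  int Samples;
};

// Re-key Map so that every key equals Value.Context. Stale entries are only
// collected during iteration; the map is mutated after the loop finishes.
inline void rekeyByContext(std::map<std::string, Profile> &Map) {
  std::vector<std::string> ToRemove;
  std::map<std::string, Profile> ToAdd;
  for (const auto &Entry : Map) {
    if (Entry.first == Entry.second.Context)
      continue;
    bool Inserted = ToAdd.emplace(Entry.second.Context, Entry.second).second;
    assert(Inserted && "Context conflict during canonicalization");
    (void)Inserted;
    ToRemove.push_back(Entry.first);
  }
  for (const auto &Key : ToRemove)
    Map.erase(Key);
  for (auto &Entry : ToAdd)
    Map.emplace(Entry.first, std::move(Entry.second));
}

// The case from the comment: "A @ B @ C" was promoted to "B @ C" while the
// original "B @ C" was promoted to "C".
inline std::map<std::string, Profile> promotedExample() {
  std::map<std::string, Profile> Map;
  Map["A @ B @ C"] = Profile{"B @ C", 10};
  Map["B @ C"] = Profile{"C", 20};
  rekeyByContext(Map);
  return Map;
}
```

Removing all stale keys before inserting the cached profiles is what keeps the promoted "B @ C" from colliding with the old "B @ C" entry.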

std::error_code ProfileSymbolList::write(raw_ostream &OS) {		std::error_code ProfileSymbolList::write(raw_ostream &OS) {
// Sort the symbols before output. If doing compression.		// Sort the symbols before output. If doing compression.
// It will make the compression much more effective.		// It will make the compression much more effective.
std::vector<StringRef> SortedList(Syms.begin(), Syms.end());		std::vector<StringRef> SortedList(Syms.begin(), Syms.end());
llvm::sort(SortedList);		llvm::sort(SortedList);

std::string OutputString;		std::string OutputString;
for (auto &Sym : SortedList) {		for (auto &Sym : SortedList) {
(57 lines not shown)	void ProfileConverter::convertCSProfiles(ProfileConverter::FrameNode &Node) {
auto *NodeProfile = Node.FuncSamples;		auto *NodeProfile = Node.FuncSamples;
for (auto &It : Node.AllChildFrames) {		for (auto &It : Node.AllChildFrames) {
auto &ChildNode = It.second;		auto &ChildNode = It.second;
convertCSProfiles(ChildNode);		convertCSProfiles(ChildNode);
auto *ChildProfile = ChildNode.FuncSamples;		auto *ChildProfile = ChildNode.FuncSamples;
if (!ChildProfile)		if (!ChildProfile)
continue;		continue;
SampleContext OrigChildContext = ChildProfile->getContext();		SampleContext OrigChildContext = ChildProfile->getContext();
		hash_code OrigChildContextHash = OrigChildContext.getHashCode();
// Reset the child context to be contextless.		// Reset the child context to be contextless.
ChildProfile->getContext().setName(OrigChildContext.getName());		ChildProfile->getContext().setName(OrigChildContext.getName());
if (NodeProfile) {		if (NodeProfile) {
// Add child profile to the callsite profile map.		// Add child profile to the callsite profile map.
auto &SamplesMap = NodeProfile->functionSamplesAt(ChildNode.CallSiteLoc);		auto &SamplesMap = NodeProfile->functionSamplesAt(ChildNode.CallSiteLoc);
SamplesMap.emplace(OrigChildContext.getName().str(), *ChildProfile);		SamplesMap.emplace(OrigChildContext.getName().str(), *ChildProfile);
NodeProfile->addTotalSamples(ChildProfile->getTotalSamples());		NodeProfile->addTotalSamples(ChildProfile->getTotalSamples());
// Remove the corresponding body sample for the callsite and update the		// Remove the corresponding body sample for the callsite and update the
// total weight.		// total weight.
auto Count = NodeProfile->removeCalledTargetAndBodySample(		auto Count = NodeProfile->removeCalledTargetAndBodySample(
ChildNode.CallSiteLoc.LineOffset, ChildNode.CallSiteLoc.Discriminator,		ChildNode.CallSiteLoc.LineOffset, ChildNode.CallSiteLoc.Discriminator,
OrigChildContext.getName());		OrigChildContext.getName());
NodeProfile->removeTotalSamples(Count);		NodeProfile->removeTotalSamples(Count);
}		}

		hash_code NewChildProfileHash(0);
// Separate child profile to be a standalone profile, if the current parent		// Separate child profile to be a standalone profile, if the current parent
// profile doesn't exist. This is a duplicating operation when the child		// profile doesn't exist. This is a duplicating operation when the child
// profile is already incorporated into the parent which is still useful and		// profile is already incorporated into the parent which is still useful and
// thus done optionally. It is seen that duplicating context profiles into		// thus done optionally. It is seen that duplicating context profiles into
// base profiles improves the code quality for thinlto build by allowing a		// base profiles improves the code quality for thinlto build by allowing a
// profile in the prelink phase for to-be-fully-inlined functions.		// profile in the prelink phase for to-be-fully-inlined functions.
if (!NodeProfile) {		if (!NodeProfile) {
ProfileMap[ChildProfile->getContext()].merge(*ChildProfile);		ProfileMap[ChildProfile->getContext()].merge(*ChildProfile);
		NewChildProfileHash = ChildProfile->getContext().getHashCode();
} else if (GenerateMergedBaseProfiles) {		} else if (GenerateMergedBaseProfiles) {
ProfileMap[ChildProfile->getContext()].merge(*ChildProfile);		ProfileMap[ChildProfile->getContext()].merge(*ChildProfile);
		NewChildProfileHash = ChildProfile->getContext().getHashCode();
auto &SamplesMap = NodeProfile->functionSamplesAt(ChildNode.CallSiteLoc);		auto &SamplesMap = NodeProfile->functionSamplesAt(ChildNode.CallSiteLoc);
SamplesMap[ChildProfile->getName().str()].getContext().setAttribute(		SamplesMap[ChildProfile->getName().str()].getContext().setAttribute(
ContextDuplicatedIntoBase);		ContextDuplicatedIntoBase);
}		}

// Remove the original child profile.		// Remove the original child profile. Check if MD5 of new child profile
ProfileMap.erase(OrigChildContext);		// collides with old profile, in this case the [] operator already
		// overwritten it without the need of erase.
		if (NewChildProfileHash != OrigChildContextHash)
		ProfileMap.erase(OrigChildContextHash);
}		}
}		}

void ProfileConverter::convertCSProfiles() { convertCSProfiles(RootFrame); }		void ProfileConverter::convertCSProfiles() { convertCSProfiles(RootFrame); }
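
Note on the new erase guard in convertCSProfiles: a minimal sketch of why the old key is erased only when the new hash differs. It assumes a std::unordered_map keyed by a 64-bit hash and a hypothetical Profile struct; plain assignment stands in for merge().

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for FunctionSamples.
struct Profile {
  std::string Name;
  uint64_t Total;
};

using HashKeyedMap = std::unordered_map<uint64_t, Profile>;

// Store P under NewHash and drop the entry under OldHash. If both hashes are
// equal, operator[] has already replaced the old entry, so erasing would
// delete the profile that was just written.
inline void moveProfile(HashKeyedMap &Map, uint64_t OldHash, uint64_t NewHash,
                        const Profile &P) {
  Map[NewHash] = P;
  if (NewHash != OldHash)
    Map.erase(OldHash);
}
```

With an unconditional erase, the colliding case would leave the map without the new profile.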

llvm/lib/ProfileData/SampleProfReader.cpp

(55 lines not shown)
static cl::opt<bool> ProfileIsFSDisciminator(		static cl::opt<bool> ProfileIsFSDisciminator(
"profile-isfs", cl::Hidden, cl::init(false),		"profile-isfs", cl::Hidden, cl::init(false),
cl::desc("Profile uses flow sensitive discriminators"));		cl::desc("Profile uses flow sensitive discriminators"));

/// Dump the function profile for \p FName.		/// Dump the function profile for \p FName.
///		///
/// \param FContext Name + context of the function to print.		/// \param FContext Name + context of the function to print.
/// \param OS Stream to emit the output to.		/// \param OS Stream to emit the output to.
void SampleProfileReader::dumpFunctionProfile(SampleContext FContext,		void SampleProfileReader::dumpFunctionProfile(const FunctionSamples &FS,
raw_ostream &OS) {		raw_ostream &OS) {
OS << "Function: " << FContext.toString() << ": " << Profiles[FContext];		OS << "Function: " << FS.getContext().toString() << ": " << FS;
}		}

/// Dump all the function profiles found on stream \p OS.		/// Dump all the function profiles found on stream \p OS.
void SampleProfileReader::dump(raw_ostream &OS) {		void SampleProfileReader::dump(raw_ostream &OS) {
std::vector<NameFunctionSamples> V;		std::vector<NameFunctionSamples> V;
sortFuncProfiles(Profiles, V);		sortFuncProfiles(Profiles, V);
for (const auto &I : V)		for (const auto &I : V)
dumpFunctionProfile(I.first, OS);		dumpFunctionProfile(*I.second, OS);
}		}

static void dumpFunctionProfileJson(const FunctionSamples &S,		static void dumpFunctionProfileJson(const FunctionSamples &S,
json::OStream &JOS, bool TopLevel = false) {		json::OStream &JOS, bool TopLevel = false) {
auto DumpBody = [&](const BodySampleMap &BodySamples) {		auto DumpBody = [&](const BodySampleMap &BodySamples) {
for (const auto &I : BodySamples) {		for (const auto &I : BodySamples) {
const LineLocation &Loc = I.first;		const LineLocation &Loc = I.first;
const SampleRecord &Sample = I.second;		const SampleRecord &Sample = I.second;
(267 lines not shown)	if ((*LineIt)[0] != ' ') {
reportError(LineIt.line_number(),		reportError(LineIt.line_number(),
"Expected 'mangled_name:NUM:NUM', found " + *LineIt);		"Expected 'mangled_name:NUM:NUM', found " + *LineIt);
return sampleprof_error::malformed;		return sampleprof_error::malformed;
}		}
DepthMetadata = 0;		DepthMetadata = 0;
SampleContext FContext(FName, CSNameTable);		SampleContext FContext(FName, CSNameTable);
if (FContext.hasContext())		if (FContext.hasContext())
++CSProfileCount;		++CSProfileCount;
Profiles[FContext] = FunctionSamples();		FunctionSamples &FProfile = Profiles.Create(FContext);
FunctionSamples &FProfile = Profiles[FContext];
FProfile.setContext(FContext);
MergeResult(Result, FProfile.addTotalSamples(NumSamples));		MergeResult(Result, FProfile.addTotalSamples(NumSamples));
MergeResult(Result, FProfile.addHeadSamples(NumHeadSamples));		MergeResult(Result, FProfile.addHeadSamples(NumHeadSamples));
InlineStack.clear();		InlineStack.clear();
InlineStack.push_back(&FProfile);		InlineStack.push_back(&FProfile);
} else {		} else {
uint64_t NumSamples;		uint64_t NumSamples;
StringRef FName;		StringRef FName;
DenseMap<StringRef, uint64_t> TargetCountMap;		DenseMap<StringRef, uint64_t> TargetCountMap;
(151 lines not shown)	inline ErrorOr<size_t> SampleProfileReaderBinary::readStringIndex(T &Table) {
auto Idx = readNumber<size_t>();		auto Idx = readNumber<size_t>();
if (std::error_code EC = Idx.getError())		if (std::error_code EC = Idx.getError())
return EC;		return EC;
if (*Idx >= Table.size())		if (*Idx >= Table.size())
return sampleprof_error::truncated_name_table;		return sampleprof_error::truncated_name_table;
return *Idx;		return *Idx;
}		}

ErrorOr<StringRef> SampleProfileReaderBinary::readStringFromTable() {		ErrorOr<StringRef>
		SampleProfileReaderBinary::readStringFromTable(size_t *RetIdx) {
		snehasish: Can we use a more descriptive name for the output parameters (here and elsewhere)?
auto Idx = readStringIndex(NameTable);		auto Idx = readStringIndex(NameTable);
if (std::error_code EC = Idx.getError())		if (std::error_code EC = Idx.getError())
return EC;		return EC;

// Lazy loading, if the string has not been materialized from memory storing		// Lazy loading, if the string has not been materialized from memory storing
// MD5 values, then it is default initialized with the null pointer. This can		// MD5 values, then it is default initialized with the null pointer. This can
// only happen when using fixed length MD5, that bounds check is performed		// only happen when using fixed length MD5, that bounds check is performed
// while parsing the name table to ensure MD5NameMemStart points to an array		// while parsing the name table to ensure MD5NameMemStart points to an array
// with enough MD5 entries.		// with enough MD5 entries.
StringRef &SR = NameTable[*Idx];		StringRef &SR = NameTable[*Idx];
if (!SR.data()) {		if (!SR.data()) {
assert(MD5NameMemStart);		assert(MD5NameMemStart);
using namespace support;		using namespace support;
uint64_t FID = endian::read<uint64_t, little, unaligned>(		uint64_t FID = endian::read<uint64_t, little, unaligned>(
MD5NameMemStart + (*Idx) * sizeof(uint64_t));		MD5NameMemStart + (*Idx) * sizeof(uint64_t));
SR = MD5StringBuf.emplace_back(std::to_string(FID));		SR = MD5StringBuf.emplace_back(std::to_string(FID));
}		}
		if (RetIdx)
		*RetIdx = *Idx;
return SR;		return SR;
}		}
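
Note on the fixed-length MD5 lookup above: a minimal sketch of reading the entry at a given index from an unaligned byte buffer as a little-endian 64-bit value. The helper name readFixedMD5 is hypothetical; the patch itself uses LLVM's support::endian helpers. As in the patch, the caller is assumed to have bounds-checked the index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read the Idx-th 64-bit little-endian value starting at Start. Bytes are
// assembled one at a time, so the result does not depend on host endianness
// or on the alignment of Start.
inline uint64_t readFixedMD5(const uint8_t *Start, size_t Idx) {
  const uint8_t *P = Start + Idx * sizeof(uint64_t);
  uint64_t V = 0;
  for (int I = 7; I >= 0; --I)
    V = (V << 8) | P[I];
  return V;
}
```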

ErrorOr<SampleContextFrames> SampleProfileReaderBinary::readContextFromTable() {		ErrorOr<SampleContextFrames>
		SampleProfileReaderBinary::readContextFromTable(size_t *RetIdx) {
auto ContextIdx = readNumber<size_t>();		auto ContextIdx = readNumber<size_t>();
if (std::error_code EC = ContextIdx.getError())		if (std::error_code EC = ContextIdx.getError())
return EC;		return EC;
if (*ContextIdx >= CSNameTable.size())		if (*ContextIdx >= CSNameTable.size())
return sampleprof_error::truncated_name_table;		return sampleprof_error::truncated_name_table;
		if (RetIdx)
		*RetIdx = *ContextIdx;
return CSNameTable[*ContextIdx];		return CSNameTable[*ContextIdx];
}		}

ErrorOr<SampleContext> SampleProfileReaderBinary::readSampleContextFromTable() {		ErrorOr<std::pair<SampleContext, uint64_t>>
		SampleProfileReaderBinary::readSampleContextFromTable() {
		SampleContext Context;
		size_t Idx;
if (ProfileIsCS) {		if (ProfileIsCS) {
auto FContext(readContextFromTable());		auto FContext(readContextFromTable(&Idx));
if (std::error_code EC = FContext.getError())		if (std::error_code EC = FContext.getError())
return EC;		return EC;
return SampleContext(*FContext);		Context = SampleContext(*FContext);
} else {		} else {
auto FName(readStringFromTable());		auto FName(readStringFromTable(&Idx));
if (std::error_code EC = FName.getError())		if (std::error_code EC = FName.getError())
return EC;		return EC;
return SampleContext(*FName);		Context = SampleContext(*FName);
}		}
		// Since MD5SampleContextStart may point to the profile's file data, need to
		// make sure it is reading the same value on big endian CPU.
		uint64_t Hash = support::endian::read64le(MD5SampleContextStart + Idx);
		// Lazy computing of hash value, write back to the table to cache it. Only
		// compute the context's hash value if it is being referenced for the first
		// time.
		if (Hash == 0) {
		assert(MD5SampleContextStart == MD5SampleContextTable.data());
		Hash = Context.getHashCode();
		support::endian::write64le(&MD5SampleContextTable[Idx], Hash);
		}
		return std::make_pair(Context, Hash);
}		}
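
Note on the lazy hash caching in readSampleContextFromTable: a minimal sketch of the idea, assuming a table in which 0 means "not computed yet". The class LazyHashTable and the FNV-1a stand-in hash are hypothetical; the patch hashes the context with its own MD5-based hash code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in hash (FNV-1a), used here only so the sketch is self-contained.
inline uint64_t standInHash(const std::string &S) {
  uint64_t H = 14695981039346656037ULL;
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  return H;
}

class LazyHashTable {
public:
  explicit LazyHashTable(std::vector<std::string> Input)
      : Names(std::move(Input)), Hashes(Names.size(), 0) {}

  // Return the hash for entry Idx, computing it on first use and writing it
  // back to the table so later lookups are a plain load.
  uint64_t hashAt(size_t Idx) {
    uint64_t &Slot = Hashes[Idx];
    if (Slot == 0) {
      Slot = standInHash(Names[Idx]);
      ++Computed;
    }
    return Slot;
  }

  size_t computedCount() const { return Computed; }

private:
  std::vector<std::string> Names;
  std::vector<uint64_t> Hashes; // 0 marks a hash that is not yet computed.
  size_t Computed = 0;
};
```

Only entries that are actually referenced pay for hashing, which matters when a small fraction of the name table belongs to top-level functions.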

std::error_code		std::error_code
		davidxl: Add a comment explaining the benefit of lazy hash computing.
SampleProfileReaderBinary::readProfile(FunctionSamples &FProfile) {		SampleProfileReaderBinary::readProfile(FunctionSamples &FProfile) {
auto NumSamples = readNumber<uint64_t>();		auto NumSamples = readNumber<uint64_t>();
if (std::error_code EC = NumSamples.getError())		if (std::error_code EC = NumSamples.getError())
return EC;		return EC;
FProfile.addTotalSamples(*NumSamples);		FProfile.addTotalSamples(*NumSamples);

// Read the samples in the body.		// Read the samples in the body.
auto NumRecords = readNumber<uint32_t>();		auto NumRecords = readNumber<uint32_t>();
(73 lines not shown)

std::error_code		std::error_code
SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start) {		SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start) {
Data = Start;		Data = Start;
auto NumHeadSamples = readNumber<uint64_t>();		auto NumHeadSamples = readNumber<uint64_t>();
if (std::error_code EC = NumHeadSamples.getError())		if (std::error_code EC = NumHeadSamples.getError())
return EC;		return EC;

ErrorOr<SampleContext> FContext(readSampleContextFromTable());		auto FContextHash(readSampleContextFromTable());
if (std::error_code EC = FContext.getError())		if (std::error_code EC = FContextHash.getError())
return EC;		return EC;

Profiles[*FContext] = FunctionSamples();		auto &[FContext, Hash] = *FContextHash;
FunctionSamples &FProfile = Profiles[*FContext];		// Use the cached hash value for insertion instead of recalculating it.
FProfile.setContext(*FContext);		auto Res = Profiles.try_emplace(Hash, FContext, FunctionSamples());
		FunctionSamples &FProfile = Res.first->second;
		FProfile.setContext(FContext);
FProfile.addHeadSamples(*NumHeadSamples);		FProfile.addHeadSamples(*NumHeadSamples);
		wlei: Here a hash collision is checked while reading the context; it could also happen at write time (profile generation or profile merging).

if (FContext->hasContext())		if (FContext.hasContext())
CSProfileCount++;		CSProfileCount++;

if (std::error_code EC = readProfile(FProfile))		if (std::error_code EC = readProfile(FProfile))
return EC;		return EC;
return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code SampleProfileReaderBinary::readImpl() {		std::error_code SampleProfileReaderBinary::readImpl() {
(131 lines not shown)	std::error_code SampleProfileReaderExtBinaryBase::readFuncOffsetTable() {

bool UseFuncOffsetList = useFuncOffsetList();		bool UseFuncOffsetList = useFuncOffsetList();
if (UseFuncOffsetList)		if (UseFuncOffsetList)
FuncOffsetList.reserve(*Size);		FuncOffsetList.reserve(*Size);
else		else
FuncOffsetTable.reserve(*Size);		FuncOffsetTable.reserve(*Size);

for (uint64_t I = 0; I < *Size; ++I) {		for (uint64_t I = 0; I < *Size; ++I) {
auto FContext(readSampleContextFromTable());		auto FContextHash(readSampleContextFromTable());
if (std::error_code EC = FContext.getError())		if (std::error_code EC = FContextHash.getError())
return EC;		return EC;

		auto &[FContext, Hash] = *FContextHash;
auto Offset = readNumber<uint64_t>();		auto Offset = readNumber<uint64_t>();
if (std::error_code EC = Offset.getError())		if (std::error_code EC = Offset.getError())
return EC;		return EC;

if (UseFuncOffsetList)		if (UseFuncOffsetList)
FuncOffsetList.emplace_back(FContext, Offset);		FuncOffsetList.emplace_back(FContext, *Offset);
else		else
FuncOffsetTable[FContext] = Offset;		// Because Profiles replace existing value with new value if collision
		// happens, we also use the latest offset so that they are consistent.
		FuncOffsetTable[Hash] = *Offset;
}		}

return sampleprof_error::success;		return sampleprof_error::success;
}		}
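
Note on the offset table above: the new comment says the latest offset is kept on a hash collision so that the table stays consistent with the profile map. A minimal sketch of that last-write-wins behavior, assuming a std::unordered_map in place of FuncOffsetTable and a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Build a hash-to-offset table from (hash, offset) pairs. When two entries
// share a hash, operator[] assignment keeps the offset that was read last.
inline std::unordered_map<uint64_t, uint64_t>
buildOffsetTable(const std::vector<std::pair<uint64_t, uint64_t>> &Entries) {
  std::unordered_map<uint64_t, uint64_t> Table;
  Table.reserve(Entries.size());
  for (const auto &E : Entries)
    Table[E.first] = E.second;
  return Table;
}
```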

std::error_code SampleProfileReaderExtBinaryBase::readFuncProfiles() {		std::error_code SampleProfileReaderExtBinaryBase::readFuncProfiles() {
// Collect functions used by current module if the Reader has been		// Collect functions used by current module if the Reader has been
// given a module.		// given a module.
(56 lines not shown)	if (ProfileIsCS) {
const uint8_t *FuncProfileAddr = Start + NameOffset.second;		const uint8_t *FuncProfileAddr = Start + NameOffset.second;
if (std::error_code EC = readFuncProfile(FuncProfileAddr))		if (std::error_code EC = readFuncProfile(FuncProfileAddr))
return EC;		return EC;
}		}
}		}
} else if (useMD5()) {		} else if (useMD5()) {
assert(!useFuncOffsetList());		assert(!useFuncOffsetList());
for (auto Name : FuncsToUse) {		for (auto Name : FuncsToUse) {
auto GUID = std::to_string(MD5Hash(Name));		auto GUID = MD5Hash(Name);
auto iter = FuncOffsetTable.find(StringRef(GUID));		auto iter = FuncOffsetTable.find(GUID);
if (iter == FuncOffsetTable.end())		if (iter == FuncOffsetTable.end())
continue;		continue;
const uint8_t *FuncProfileAddr = Start + iter->second;		const uint8_t *FuncProfileAddr = Start + iter->second;
if (std::error_code EC = readFuncProfile(FuncProfileAddr))		if (std::error_code EC = readFuncProfile(FuncProfileAddr))
return EC;		return EC;
}		}
} else if (Remapper) {		} else if (Remapper) {
assert(useFuncOffsetList());		assert(useFuncOffsetList());
for (auto NameOffset : FuncOffsetList) {		for (auto NameOffset : FuncOffsetList) {
SampleContext FContext(NameOffset.first);		SampleContext FContext(NameOffset.first);
auto FuncName = FContext.getName();		auto FuncName = FContext.getName();
if (!FuncsToUse.count(FuncName) && !Remapper->exist(FuncName))		if (!FuncsToUse.count(FuncName) && !Remapper->exist(FuncName))
continue;		continue;
const uint8_t *FuncProfileAddr = Start + NameOffset.second;		const uint8_t *FuncProfileAddr = Start + NameOffset.second;
if (std::error_code EC = readFuncProfile(FuncProfileAddr))		if (std::error_code EC = readFuncProfile(FuncProfileAddr))
return EC;		return EC;
}		}
} else {		} else {
assert(!useFuncOffsetList());		assert(!useFuncOffsetList());
for (auto Name : FuncsToUse) {		for (auto Name : FuncsToUse) {
auto iter = FuncOffsetTable.find(Name);		auto iter = FuncOffsetTable.find(MD5Hash(Name));
if (iter == FuncOffsetTable.end())		if (iter == FuncOffsetTable.end())
continue;		continue;
const uint8_t *FuncProfileAddr = Start + iter->second;		const uint8_t *FuncProfileAddr = Start + iter->second;
if (std::error_code EC = readFuncProfile(FuncProfileAddr))		if (std::error_code EC = readFuncProfile(FuncProfileAddr))
return EC;		return EC;
}		}
}		}
Data = End;		Data = End;
(111 lines not shown)	std::error_code SampleProfileReaderBinary::readNameTable() {
// tables mixing string and MD5, all of them have to be normalized to use MD5,		// tables mixing string and MD5, all of them have to be normalized to use MD5,
// because optimization passes can only handle either type.		// because optimization passes can only handle either type.
bool UseMD5 = useMD5();		bool UseMD5 = useMD5();
if (UseMD5)		if (UseMD5)
MD5StringBuf.reserve(MD5StringBuf.size() + *Size);		MD5StringBuf.reserve(MD5StringBuf.size() + *Size);

NameTable.clear();		NameTable.clear();
NameTable.reserve(*Size);		NameTable.reserve(*Size);
		if (!ProfileIsCS) {
		MD5SampleContextTable.clear();
		if (UseMD5)
		MD5SampleContextTable.reserve(*Size);
		else
		// If we are using strings, delay MD5 computation since only a portion of
		// names are used by top level functions. Use 0 to indicate MD5 value is
		// to be calculated as no known string has a MD5 value of 0.
		MD5SampleContextTable.resize(*Size);
		}
for (size_t I = 0; I < *Size; ++I) {		for (size_t I = 0; I < *Size; ++I) {
auto Name(readString());		auto Name(readString());
if (std::error_code EC = Name.getError())		if (std::error_code EC = Name.getError())
return EC;		return EC;
if (UseMD5) {		if (UseMD5) {
uint64_t FID = MD5Hash(*Name);		uint64_t FID = hashFuncName(*Name);
		if (!ProfileIsCS)
		MD5SampleContextTable.emplace_back(FID);
NameTable.emplace_back(MD5StringBuf.emplace_back(std::to_string(FID)));		NameTable.emplace_back(MD5StringBuf.emplace_back(std::to_string(FID)));
} else		} else
NameTable.push_back(*Name);		NameTable.push_back(*Name);
}		}
		if (!ProfileIsCS)
		MD5SampleContextStart = MD5SampleContextTable.data();
return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code		std::error_code
SampleProfileReaderExtBinaryBase::readNameTableSec(bool IsMD5,		SampleProfileReaderExtBinaryBase::readNameTableSec(bool IsMD5,
bool FixedLengthMD5) {		bool FixedLengthMD5) {
if (FixedLengthMD5) {		if (FixedLengthMD5) {
if (!IsMD5)		if (!IsMD5)
(11 lines not shown)	if (FixedLengthMD5) {
// Preallocate and initialize NameTable so we can check whether a name		// Preallocate and initialize NameTable so we can check whether a name
// index has been read before by checking whether the element in the		// index has been read before by checking whether the element in the
// NameTable is empty, meanwhile readStringIndex can do the boundary		// NameTable is empty, meanwhile readStringIndex can do the boundary
// check using the size of NameTable.		// check using the size of NameTable.
MD5StringBuf.reserve(MD5StringBuf.size() + *Size);		MD5StringBuf.reserve(MD5StringBuf.size() + *Size);
NameTable.clear();		NameTable.clear();
NameTable.resize(*Size);		NameTable.resize(*Size);
MD5NameMemStart = Data;		MD5NameMemStart = Data;
		if (!ProfileIsCS)
		MD5SampleContextStart = reinterpret_cast<const uint64_t *>(Data);
Data = Data + (*Size) * sizeof(uint64_t);		Data = Data + (*Size) * sizeof(uint64_t);
return sampleprof_error::success;		return sampleprof_error::success;
}		}

if (IsMD5) {		if (IsMD5) {
assert(!FixedLengthMD5 && "FixedLengthMD5 should be unreachable here");		assert(!FixedLengthMD5 && "FixedLengthMD5 should be unreachable here");
auto Size = readNumber<size_t>();		auto Size = readNumber<size_t>();
if (std::error_code EC = Size.getError())		if (std::error_code EC = Size.getError())
return EC;		return EC;

MD5StringBuf.reserve(MD5StringBuf.size() + *Size);		MD5StringBuf.reserve(MD5StringBuf.size() + *Size);
NameTable.clear();		NameTable.clear();
NameTable.reserve(*Size);		NameTable.reserve(*Size);
		if (!ProfileIsCS)
		MD5SampleContextTable.resize(*Size);
for (size_t I = 0; I < *Size; ++I) {		for (size_t I = 0; I < *Size; ++I) {
auto FID = readNumber<uint64_t>();		auto FID = readNumber<uint64_t>();
if (std::error_code EC = FID.getError())		if (std::error_code EC = FID.getError())
return EC;		return EC;
		if (!ProfileIsCS)
		support::endian::write64le(&MD5SampleContextTable[I], *FID);
NameTable.emplace_back(MD5StringBuf.emplace_back(std::to_string(*FID)));		NameTable.emplace_back(MD5StringBuf.emplace_back(std::to_string(*FID)));
}		}
		if (!ProfileIsCS)
		MD5SampleContextStart = MD5SampleContextTable.data();
return sampleprof_error::success;		return sampleprof_error::success;
}		}

return SampleProfileReaderBinary::readNameTable();		return SampleProfileReaderBinary::readNameTable();
}		}

// Read in the CS name table section, which basically contains a list of context		// Read in the CS name table section, which basically contains a list of context
// vectors. Each element of a context vector, aka a frame, refers to the		// vectors. Each element of a context vector, aka a frame, refers to the
// underlying raw function names that are stored in the name table, as well as		// underlying raw function names that are stored in the name table, as well as
// a callsite identifier that only makes sense for non-leaf frames.		// a callsite identifier that only makes sense for non-leaf frames.
std::error_code SampleProfileReaderExtBinaryBase::readCSNameTableSec() {		std::error_code SampleProfileReaderExtBinaryBase::readCSNameTableSec() {
auto Size = readNumber<size_t>();		auto Size = readNumber<size_t>();
if (std::error_code EC = Size.getError())		if (std::error_code EC = Size.getError())
return EC;		return EC;

CSNameTable.clear();		CSNameTable.clear();
CSNameTable.reserve(*Size);		CSNameTable.reserve(*Size);
		if (ProfileIsCS) {
		// Delay MD5 computation of CS context until they are needed. Use 0 to
		// indicate MD5 value is to be calculated as no known string has a MD5
		// value of 0.
		MD5SampleContextTable.clear();
		MD5SampleContextTable.resize(*Size);
		MD5SampleContextStart = MD5SampleContextTable.data();
		}
for (size_t I = 0; I < *Size; ++I) {		for (size_t I = 0; I < *Size; ++I) {
CSNameTable.emplace_back(SampleContextFrameVector());		CSNameTable.emplace_back(SampleContextFrameVector());
auto ContextSize = readNumber<uint32_t>();		auto ContextSize = readNumber<uint32_t>();
if (std::error_code EC = ContextSize.getError())		if (std::error_code EC = ContextSize.getError())
return EC;		return EC;
for (uint32_t J = 0; J < *ContextSize; ++J) {		for (uint32_t J = 0; J < *ContextSize; ++J) {
auto FName(readStringFromTable());		auto FName(readStringFromTable());
if (std::error_code EC = FName.getError())		if (std::error_code EC = FName.getError())
(47 lines not shown)	if (!ProfileIsCS) {
auto LineOffset = readNumber<uint64_t>();		auto LineOffset = readNumber<uint64_t>();
if (std::error_code EC = LineOffset.getError())		if (std::error_code EC = LineOffset.getError())
return EC;		return EC;

auto Discriminator = readNumber<uint64_t>();		auto Discriminator = readNumber<uint64_t>();
if (std::error_code EC = Discriminator.getError())		if (std::error_code EC = Discriminator.getError())
return EC;		return EC;

auto FContext(readSampleContextFromTable());		auto FContextHash(readSampleContextFromTable());
if (std::error_code EC = FContext.getError())		if (std::error_code EC = FContextHash.getError())
return EC;		return EC;

		auto &[FContext, Hash] = *FContextHash;
FunctionSamples *CalleeProfile = nullptr;		FunctionSamples *CalleeProfile = nullptr;
if (FProfile) {		if (FProfile) {
CalleeProfile = const_cast<FunctionSamples *>(		CalleeProfile = const_cast<FunctionSamples *>(
&FProfile->functionSamplesAt(LineLocation(		&FProfile->functionSamplesAt(LineLocation(
*LineOffset,		*LineOffset,
*Discriminator))[std::string(FContext.get().getName())]);		*Discriminator))[std::string(FContext.getName())]);
}		}
if (std::error_code EC =		if (std::error_code EC =
readFuncMetadata(ProfileHasAttribute, CalleeProfile))		readFuncMetadata(ProfileHasAttribute, CalleeProfile))
return EC;		return EC;
}		}
}		}
}		}

return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code		std::error_code
SampleProfileReaderExtBinaryBase::readFuncMetadata(bool ProfileHasAttribute) {		SampleProfileReaderExtBinaryBase::readFuncMetadata(bool ProfileHasAttribute) {
while (Data < End) {		while (Data < End) {
auto FContext(readSampleContextFromTable());		auto FContextHash(readSampleContextFromTable());
if (std::error_code EC = FContext.getError())		if (std::error_code EC = FContextHash.getError())
return EC;		return EC;
		auto &[FContext, Hash] = *FContextHash;
FunctionSamples *FProfile = nullptr;		FunctionSamples *FProfile = nullptr;
auto It = Profiles.find(*FContext);		auto It = Profiles.find(FContext);
if (It != Profiles.end())		if (It != Profiles.end())
FProfile = &It->second;		FProfile = &It->second;

if (std::error_code EC = readFuncMetadata(ProfileHasAttribute, FProfile))		if (std::error_code EC = readFuncMetadata(ProfileHasAttribute, FProfile))
return EC;		return EC;
}		}

assert(Data == End && "More data is read than expected");		assert(Data == End && "More data is read than expected");
(667 lines not shown)

llvm/lib/ProfileData/SampleProfWriter.cpp

(351 lines not shown)	std::error_code SampleProfileWriterExtBinaryBase::writeNameTable() {
stablizeNameTable(NameTable, V);		stablizeNameTable(NameTable, V);

// Write out the MD5 name table. We wrote unencoded MD5 so reader can		// Write out the MD5 name table. We wrote unencoded MD5 so reader can
// retrieve the name using the name index without having to read the		// retrieve the name using the name index without having to read the
// whole name table.		// whole name table.
encodeULEB128(NameTable.size(), OS);		encodeULEB128(NameTable.size(), OS);
support::endian::Writer Writer(OS, support::little);		support::endian::Writer Writer(OS, support::little);
for (auto N : V)		for (auto N : V)
Writer.write(MD5Hash(N));		Writer.write(hashFuncName(N));
return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code SampleProfileWriterExtBinaryBase::writeNameTableSection(		std::error_code SampleProfileWriterExtBinaryBase::writeNameTableSection(
const SampleProfileMap &ProfileMap) {		const SampleProfileMap &ProfileMap) {
for (const auto &I : ProfileMap) {		for (const auto &I : ProfileMap) {
assert(I.first == I.second.getContext() && "Inconsistent profile map");
addContext(I.second.getContext());		addContext(I.second.getContext());
addNames(I.second);		addNames(I.second);
}		}

// If NameTable contains ".__uniq." suffix, set SecFlagUniqSuffix flag		// If NameTable contains ".__uniq." suffix, set SecFlagUniqSuffix flag
// so compiler won't strip the suffix during profile matching after		// so compiler won't strip the suffix during profile matching after
// seeing the flag in the profile.		// seeing the flag in the profile.
for (const auto &I : NameTable) {		for (const auto &I : NameTable) {
(345 lines not shown)	SampleProfileWriterBinary::writeHeader(const SampleProfileMap &ProfileMap) {
writeMagicIdent(Format);		writeMagicIdent(Format);

computeSummary(ProfileMap);		computeSummary(ProfileMap);
if (auto EC = writeSummary())		if (auto EC = writeSummary())
return EC;		return EC;

// Generate the name table for all the functions referenced in the profile.		// Generate the name table for all the functions referenced in the profile.
for (const auto &I : ProfileMap) {		for (const auto &I : ProfileMap) {
assert(I.first == I.second.getContext() && "Inconsistent profile map");		addContext(I.second.getContext());
addContext(I.first);
addNames(I.second);		addNames(I.second);
}		}

writeNameTable();		writeNameTable();
return sampleprof_error::success;		return sampleprof_error::success;
}		}

void SampleProfileWriterExtBinaryBase::setToCompressAllSections() {		void SampleProfileWriterExtBinaryBase::setToCompressAllSections() {
(198 lines not shown)

llvm/lib/Transforms/IPO/SampleContextTracker.cpp

	(195 lines not shown)

	// Profiler tracker than manages profiles and its associated context			// Profiler tracker than manages profiles and its associated context
	SampleContextTracker::SampleContextTracker(			SampleContextTracker::SampleContextTracker(
	SampleProfileMap &Profiles,			SampleProfileMap &Profiles,
	const DenseMap<uint64_t, StringRef> *GUIDToFuncNameMap)			const DenseMap<uint64_t, StringRef> *GUIDToFuncNameMap)
	: GUIDToFuncNameMap(GUIDToFuncNameMap) {			: GUIDToFuncNameMap(GUIDToFuncNameMap) {
	for (auto &FuncSample : Profiles) {			for (auto &FuncSample : Profiles) {
	FunctionSamples *FSamples = &FuncSample.second;			FunctionSamples *FSamples = &FuncSample.second;
	SampleContext Context = FuncSample.first;			SampleContext Context = FuncSample.second.getContext();
	LLVM_DEBUG(dbgs() << "Tracking Context for function: " << Context.toString()			LLVM_DEBUG(dbgs() << "Tracking Context for function: " << Context.toString()
	<< "\n");			<< "\n");
	ContextTrieNode *NewNode = getOrCreateContextPath(Context, true);			ContextTrieNode *NewNode = getOrCreateContextPath(Context, true);
	assert(!NewNode->getFunctionSamples() &&			assert(!NewNode->getFunctionSamples() &&
	"New node can't have sample profile");			"New node can't have sample profile");
	NewNode->setFunctionSamples(FSamples);			NewNode->setFunctionSamples(FSamples);
	}			}
	populateFuncToCtxtMap();			populateFuncToCtxtMap();
	▲ Show 20 Lines • Show All 420 Lines • ▼ Show 20 Lines
	}			}

	void SampleContextTracker::createContextLessProfileMap(			void SampleContextTracker::createContextLessProfileMap(
	SampleProfileMap &ContextLessProfiles) {			SampleProfileMap &ContextLessProfiles) {
	for (auto *Node : *this) {			for (auto *Node : *this) {
	FunctionSamples *FProfile = Node->getFunctionSamples();			FunctionSamples *FProfile = Node->getFunctionSamples();
	// Profile's context can be empty, use ContextNode's func name.			// Profile's context can be empty, use ContextNode's func name.
	if (FProfile)			if (FProfile)
	ContextLessProfiles[Node->getFuncName()].merge(*FProfile);			ContextLessProfiles.Create(Node->getFuncName()).merge(*FProfile);
	}			}
	}			}
	} // namespace llvm			} // namespace llvm
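
The context-less merge in `createContextLessProfileMap` can be pictured with a toy model. This is a minimal sketch, assuming a profile is reduced to a single sample count and that "merge" means adding counts; `ContextProfile` and `flattenByFunction` are hypothetical names for illustration and are not part of the patch or of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Each entry pairs a leaf function name with the sample count recorded for
// one calling context of that function (a simplification of FunctionSamples).
using ContextProfile = std::pair<std::string, uint64_t>;

// Merge every per-context profile into one entry per function name.
inline std::map<std::string, uint64_t>
flattenByFunction(const std::vector<ContextProfile> &Profiles) {
  std::map<std::string, uint64_t> ContextLess;
  for (const auto &Entry : Profiles) {
    // operator[] creates a zero-initialized entry the first time a name is
    // seen; later contexts of the same function accumulate into it.
    ContextLess[Entry.first] += Entry.second;
  }
  return ContextLess;
}
```

In the patch itself the map is keyed by a hash rather than by the name, which is why the new code goes through an explicit `Create(...)` call instead of `operator[]`.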

llvm/test/tools/llvm-profdata/Inputs/sample-nametable-after-samples.profdata

This is a binary file.

llvm/test/tools/llvm-profdata/sample-nametable.test

	Test several edge cases with unusual name table data in ExtBinary format.			Test several edge cases with unusual name table data in ExtBinary format.

	1- Multiple fixed-length MD5 name tables. Reading a new table should clear the content from old table, and a valid name index for the old name table should become invalid if the new name table has fewer entries.			1- Multiple fixed-length MD5 name tables. Reading a new table should clear the content from old table, and a valid name index for the old name table should become invalid if the new name table has fewer entries.
	RUN: not llvm-profdata show --sample %p/Inputs/sample-multiple-nametables.profdata			RUN: not llvm-profdata show --sample %p/Inputs/sample-multiple-nametables.profdata

	2- Multiple name tables, the first one has an empty string, the second one tricks the reader into expecting fixed-length MD5 values. Reader should not attempt "lazy loading" of the MD5 string in this case.			2- Multiple name tables, the first one has an empty string, the second one tricks the reader into expecting fixed-length MD5 values. Reader should not attempt "lazy loading" of the MD5 string in this case.
	RUN: not llvm-profdata show --sample %p/Inputs/sample-nametable-empty-string.profdata			RUN: not llvm-profdata show --sample %p/Inputs/sample-nametable-empty-string.profdata

	3- The data of the name table is placed after the data of the profiles. The reader should handle it correctly.			3- The data of the name table is placed after the data of the profiles. The reader should handle it correctly.
	RUN: llvm-profdata merge --sample --text %p/Inputs/sample-nametable-after-samples.profdata | FileCheck %s			RUN: llvm-profdata merge --sample --text %p/Inputs/sample-nametable-after-samples.profdata | FileCheck %s
	CHECK: 18446744073709551615:2:9			CHECK: 18446744073709551613:2:9
				huangjd (author): Note: 0xFFFFFFFFFFFFFFFF and 0xFFFFFFFFFFFFFFFE are reserved in llvm::DenseMap and cannot be used as keys, so I am changing the value to 0xFFFFFFFFFFFFFFFD. I am not adding a check for them in SampleProfileMap because the assumption that a hash value never equals either of them is made in so many places throughout LLVM; any check should be done inside DenseMap if actually needed.
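
The constraint described in the note can be stated as a small standalone check. This is a minimal sketch, assuming only the two reserved values named in the comment; `isReservedDenseMapKey` is a hypothetical helper, not an LLVM API, and the patch deliberately does not add such a check.

```cpp
#include <cassert>
#include <cstdint>

// The two 64-bit values the review comment says llvm::DenseMap reserves.
// The constant names are illustrative only.
constexpr uint64_t ReservedKeyA = 0xFFFFFFFFFFFFFFFFULL;
constexpr uint64_t ReservedKeyB = 0xFFFFFFFFFFFFFFFEULL;

// Hypothetical helper: true if Hash is one of the reserved values and so
// could not be used as a key.
constexpr bool isReservedDenseMapKey(uint64_t Hash) {
  return Hash == ReservedKeyA || Hash == ReservedKeyB;
}
```

Under this assumption the old test value 18446744073709551615 is rejected, while the replacement 18446744073709551613 (0xFFFFFFFFFFFFFFFD) is an ordinary key.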

llvm/tools/llvm-profdata/llvm-profdata.cpp

Show First 20 Lines • Show All 588 Lines • ▼ Show 20 Lines	adjustInstrProfile(std::unique_ptr<WriterContext> &WC,
unsigned InstrProfColdThreshold) {		unsigned InstrProfColdThreshold) {
// Function to its entry in instr profile.		// Function to its entry in instr profile.
StringMap<InstrProfileEntry> InstrProfileMap;		StringMap<InstrProfileEntry> InstrProfileMap;
StringMap<StringRef> StaticFuncMap;		StringMap<StringRef> StaticFuncMap;
InstrProfSummaryBuilder IPBuilder(ProfileSummaryBuilder::DefaultCutoffs);		InstrProfSummaryBuilder IPBuilder(ProfileSummaryBuilder::DefaultCutoffs);

auto checkSampleProfileHasFUnique = [&Reader]() {		auto checkSampleProfileHasFUnique = [&Reader]() {
for (const auto &PD : Reader->getProfiles()) {		for (const auto &PD : Reader->getProfiles()) {
auto &FContext = PD.first;		auto &FContext = PD.second.getContext();
if (FContext.toString().find(FunctionSamples::UniqSuffix) !=		if (FContext.toString().find(FunctionSamples::UniqSuffix) !=
std::string::npos) {		std::string::npos) {
return true;		return true;
}		}
}		}
return false;		return false;
};		};

▲ Show 20 Lines • Show All 2,227 Lines • ▼ Show 20 Lines	else
Reader->dump(OS);		Reader->dump(OS);
} else {		} else {
if (SFormat == ShowFormat::Json)		if (SFormat == ShowFormat::Json)
exitWithError(		exitWithError(
"the JSON format is supported only when all functions are to "		"the JSON format is supported only when all functions are to "
"be printed");		"be printed");

// TODO: parse context string to support filtering by contexts.		// TODO: parse context string to support filtering by contexts.
Reader->dumpFunctionProfile(StringRef(ShowFunction), OS);		FunctionSamples *FS = Reader->getSamplesFor(StringRef(ShowFunction));
		Reader->dumpFunctionProfile(FS ? *FS : FunctionSamples(), OS);
}		}

if (ShowProfileSymbolList) {		if (ShowProfileSymbolList) {
std::unique_ptr<sampleprof::ProfileSymbolList> ReaderList =		std::unique_ptr<sampleprof::ProfileSymbolList> ReaderList =
Reader->getProfileSymbolList();		Reader->getProfileSymbolList();
ReaderList->dump(OS);		ReaderList->dump(OS);
}		}

▲ Show 20 Lines • Show All 306 Lines • Show Last 20 Lines

llvm/tools/llvm-profgen/ProfileGenerator.cpp

Show First 20 Lines • Show All 443 Lines • ▼ Show 20 Lines	for (const auto &CI : *SampleCounters) {
}		}
}		}
return true;		return true;
}		}

bool ProfileGenerator::collectFunctionsFromLLVMProfile(		bool ProfileGenerator::collectFunctionsFromLLVMProfile(
std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {		std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {
for (const auto &FS : ProfileMap) {		for (const auto &FS : ProfileMap) {
if (auto *Func = Binary->getBinaryFunction(FS.first.getName()))		if (auto *Func = Binary->getBinaryFunction(FS.second.getName()))
ProfiledFunctions.insert(Func);		ProfiledFunctions.insert(Func);
}		}
return true;		return true;
}		}

bool CSProfileGenerator::collectFunctionsFromLLVMProfile(		bool CSProfileGenerator::collectFunctionsFromLLVMProfile(
std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {		std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {
for (auto *Node : ContextTracker) {		for (auto *Node : ContextTracker) {
if (!Node->getFuncName().empty())		if (!Node->getFuncName().empty())
if (auto *Func = Binary->getBinaryFunction(Node->getFuncName()))		if (auto *Func = Binary->getBinaryFunction(Node->getFuncName()))
ProfiledFunctions.insert(Func);		ProfiledFunctions.insert(Func);
}		}
return true;		return true;
}		}

FunctionSamples &		FunctionSamples &
ProfileGenerator::getTopLevelFunctionProfile(StringRef FuncName) {		ProfileGenerator::getTopLevelFunctionProfile(StringRef FuncName) {
SampleContext Context(FuncName);		SampleContext Context(FuncName);
auto Ret = ProfileMap.emplace(Context, FunctionSamples());		return ProfileMap.Create(Context);
if (Ret.second) {
FunctionSamples &FProfile = Ret.first->second;
FProfile.setContext(Context);
}
return Ret.first->second;
}		}

void ProfileGenerator::generateProfile() {		void ProfileGenerator::generateProfile() {
collectProfiledFunctions();		collectProfiledFunctions();

if (Binary->usePseudoProbes())		if (Binary->usePseudoProbes())
Binary->decodePseudoProbe();		Binary->decodePseudoProbe();

Show All 15 Lines
}		}

void ProfileGenerator::trimColdProfiles(const SampleProfileMap &Profiles,		void ProfileGenerator::trimColdProfiles(const SampleProfileMap &Profiles,
uint64_t ColdCntThreshold) {		uint64_t ColdCntThreshold) {
if (!TrimColdProfile)		if (!TrimColdProfile)
return;		return;

// Move cold profiles into a tmp container.		// Move cold profiles into a tmp container.
std::vector<SampleContext> ColdProfiles;		std::vector<hash_code> ColdProfileHashes;
for (const auto &I : ProfileMap) {		for (const auto &I : ProfileMap) {
if (I.second.getTotalSamples() < ColdCntThreshold)		if (I.second.getTotalSamples() < ColdCntThreshold)
ColdProfiles.emplace_back(I.first);		ColdProfileHashes.emplace_back(I.first);
}		}

// Remove the cold profile from ProfileMap.		// Remove the cold profile from ProfileMap.
for (const auto &I : ColdProfiles)		for (const auto &I : ColdProfileHashes)
ProfileMap.erase(I);		ProfileMap.erase(I);
}		}
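
The two-pass shape of `trimColdProfiles` (collect keys, then erase) avoids modifying the map while iterating over it. Below is a minimal sketch of that pattern, assuming a plain `std::unordered_map` keyed by a 64-bit hash with a bare sample count as the value; `ToyProfileMap` and `trimCold` are hypothetical stand-ins, not the types used in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-in for a profile map keyed by a 64-bit hash; the mapped value is
// just a total sample count here.
using ToyProfileMap = std::unordered_map<uint64_t, uint64_t>;

// Collect the keys of cold entries first, then erase them, so the map is
// never modified while it is being iterated.
inline void trimCold(ToyProfileMap &Profiles, uint64_t ColdCntThreshold) {
  std::vector<uint64_t> ColdKeys;
  for (const auto &Entry : Profiles)
    if (Entry.second < ColdCntThreshold)
      ColdKeys.push_back(Entry.first);

  for (uint64_t Key : ColdKeys)
    Profiles.erase(Key);
}
```

Storing only the hash key in the temporary vector, as the new side of the diff does with `hash_code`, keeps the scratch container small compared with copying whole contexts.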

void ProfileGenerator::generateLineNumBasedProfile() {		void ProfileGenerator::generateLineNumBasedProfile() {
assert(SampleCounters->size() == 1 &&		assert(SampleCounters->size() == 1 &&
"Must have one entry for profile generation.");		"Must have one entry for profile generation.");
const SampleCounter &SC = SampleCounters->begin()->second;		const SampleCounter &SC = SampleCounters->begin()->second;
// Fill in function body samples		// Fill in function body samples
▲ Show 20 Lines • Show All 435 Lines • ▼ Show 20 Lines

void CSProfileGenerator::convertToProfileMap(		void CSProfileGenerator::convertToProfileMap(
ContextTrieNode &Node, SampleContextFrameVector &Context) {		ContextTrieNode &Node, SampleContextFrameVector &Context) {
FunctionSamples *FProfile = Node.getFunctionSamples();		FunctionSamples *FProfile = Node.getFunctionSamples();
if (FProfile) {		if (FProfile) {
Context.emplace_back(Node.getFuncName(), LineLocation(0, 0));		Context.emplace_back(Node.getFuncName(), LineLocation(0, 0));
// Save the new context for future references.		// Save the new context for future references.
SampleContextFrames NewContext = *Contexts.insert(Context).first;		SampleContextFrames NewContext = *Contexts.insert(Context).first;
auto Ret = ProfileMap.emplace(NewContext, std::move(*FProfile));		auto Ret = ProfileMap.emplace(NewContext, std::move(*FProfile));
		wlei: Here, at profile generation time, if a hash collision happens, the sample is lost.
FunctionSamples &NewProfile = Ret.first->second;		FunctionSamples &NewProfile = Ret.first->second;
NewProfile.getContext().setContext(NewContext);		NewProfile.getContext().setContext(NewContext);
Context.pop_back();		Context.pop_back();
}		}

for (auto &It : Node.getAllChildContext()) {		for (auto &It : Node.getAllChildContext()) {
ContextTrieNode &ChildNode = It.second;		ContextTrieNode &ChildNode = It.second;
Context.emplace_back(Node.getFuncName(), ChildNode.getCallSiteLoc());		Context.emplace_back(Node.getFuncName(), ChildNode.getCallSiteLoc());
Show All 37 Lines	if (TrimColdProfile || CSProfMergeColdContext) {
SampleContextTrimmer(ProfileMap)		SampleContextTrimmer(ProfileMap)
.trimAndMergeColdContextProfiles(		.trimAndMergeColdContextProfiles(
HotCountThreshold, TrimColdProfile, CSProfMergeColdContext,		HotCountThreshold, TrimColdProfile, CSProfMergeColdContext,
CSProfMaxColdContextDepth, EnableCSPreInliner);		CSProfMaxColdContextDepth, EnableCSPreInliner);
}		}

// Merge function samples of CS profile to calculate profile density.		// Merge function samples of CS profile to calculate profile density.
sampleprof::SampleProfileMap ContextLessProfiles;		sampleprof::SampleProfileMap ContextLessProfiles;
for (const auto &I : ProfileMap) {		ProfileConverter::flattenProfile(ProfileMap, ContextLessProfiles, true);
ContextLessProfiles[I.second.getName()].merge(I.second);
}

calculateAndShowDensity(ContextLessProfiles);		calculateAndShowDensity(ContextLessProfiles);
if (GenCSNestedProfile) {		if (GenCSNestedProfile) {
ProfileConverter CSConverter(ProfileMap);		ProfileConverter CSConverter(ProfileMap);
CSConverter.convertCSProfiles();		CSConverter.convertCSProfiles();
FunctionSamples::ProfileIsCS = false;		FunctionSamples::ProfileIsCS = false;
}		}
}		}
▲ Show 20 Lines • Show All 243 Lines • Show Last 20 Lines
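
The reviewer's concern about `ProfileMap.emplace` in `convertToProfileMap` follows from how `emplace` behaves on any hash-keyed map: if the key is already present, nothing is inserted. A minimal sketch with `std::unordered_map`, assuming two different functions whose names hash to the same 64-bit value; `ToySamples` and `addProfile` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Simplified stand-in for a function profile.
struct ToySamples {
  std::string Name;
  uint64_t TotalSamples;
};

// Returns true if the profile was stored, false if an entry with the same
// hash key already existed and the new profile was dropped.
inline bool addProfile(std::unordered_map<uint64_t, ToySamples> &Map,
                       uint64_t Hash, ToySamples Samples) {
  return Map.emplace(Hash, std::move(Samples)).second;
}
```

A caller that ignores the returned flag, as the code under review ignores `Ret.second`, silently loses the second profile.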

llvm/unittests/tools/llvm-profdata/CMakeLists.txt

	set(LLVM_LINK_COMPONENTS			set(LLVM_LINK_COMPONENTS
	Core			Core
	ProfileData			ProfileData
	Support			Support
	)			)

	add_llvm_unittest(LLVMProfdataTests			add_llvm_unittest(LLVMProfdataTests
	OutputSizeLimitTest.cpp			OutputSizeLimitTest.cpp
				MD5CollisionTest.cpp
	)			)

	target_link_libraries(LLVMProfdataTests PRIVATE LLVMTestingSupport)			target_link_libraries(LLVMProfdataTests PRIVATE LLVMTestingSupport)

	set_property(TARGET LLVMProfdataTests PROPERTY FOLDER "Tests/UnitTests/ToolTests")			set_property(TARGET LLVMProfdataTests PROPERTY FOLDER "Tests/UnitTests/ToolTests")

llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp

This file was added.

				//===- llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp ------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				/// Test whether the MD5-key SampleProfileMap can handle collision correctly.
				/// Probability of collision is rare but not negligible since we only use the
				/// lower 64 bits of the MD5 value. A unit test is required because the function
				/// names are not printable ASCII characters.

				#include "llvm/ProfileData/SampleProfReader.h"
				#include "llvm/Support/VirtualFileSystem.h"
				#include "llvm/Testing/Support/Error.h"
				#include "gtest/gtest.h"

				/// According to https://en.wikipedia.org/wiki/MD5#Preimage_vulnerability, the
				/// MD5 of the two strings are 79054025255fb1a26e4bc422aef54eb4.

				// First 8 bytes of the MD5.
				const uint64_t ExpectedHash = 0xa2b15f2525400579;

				// clang-format off
				const uint8_t ProfileData[] = {
				0x84, 0xe4, 0xd0, 0xb1, 0xf4, 0xc9, 0x94, 0xa8,
				0x53, 0x67, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x7D, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x03, 0x01, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x80, 0x01, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x90, 0x01, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x00, 0x00, 0x00,

				/// Name Table
				0x02,
				/// String1
				0xd1, 0x31, 0xdd, 0x02, 0xc5, 0xe6, 0xee, 0xc4,
				0x69, 0x3d, 0x9a, 0x06, 0x98, 0xaf, 0xf9, 0x5c,
				0x2f, 0xca, 0xb5, 0x87, 0x12, 0x46, 0x7e, 0xab,
				0x40, 0x04, 0x58, 0x3e, 0xb8, 0xfb, 0x7f, 0x89,
				0x55, 0xad, 0x34, 0x06, 0x09, 0xf4, 0xb3, 0x02,
				0x83, 0xe4, 0x88, 0x83, 0x25, 0x71, 0x41, 0x5a,
				0x08, 0x51, 0x25, 0xe8, 0xf7, 0xcd, 0xc9, 0x9f,
				0xd9, 0x1d, 0xbd, 0xf2, 0x80, 0x37, 0x3c, 0x5b,
				0xd8, 0x82, 0x3e, 0x31, 0x56, 0x34, 0x8f, 0x5b,
				0xae, 0x6d, 0xac, 0xd4, 0x36, 0xc9, 0x19, 0xc6,
				0xdd, 0x53, 0xe2, 0xb4, 0x87, 0xda, 0x03, 0xfd,
				0x02, 0x39, 0x63, 0x06, 0xd2, 0x48, 0xcd, 0xa0,
				0xe9, 0x9f, 0x33, 0x42, 0x0f, 0x57, 0x7e, 0xe8,
				0xce, 0x54, 0xb6, 0x70, 0x80, 0xa8, 0x0d, 0x1e,
				0xc6, 0x98, 0x21, 0xbc, 0xb6, 0xa8, 0x83, 0x93,
				0x96, 0xf9, 0x65, 0x2b, 0x6f, 0xf7, 0x2a, 0x70, 0x00,
				/// String2
				0xd1, 0x31, 0xdd, 0x02, 0xc5, 0xe6, 0xee, 0xc4,
				0x69, 0x3d, 0x9a, 0x06, 0x98, 0xaf, 0xf9, 0x5c,
				0x2f, 0xca, 0xb5, 0x07, 0x12, 0x46, 0x7e, 0xab,
				0x40, 0x04, 0x58, 0x3e, 0xb8, 0xfb, 0x7f, 0x89,
				0x55, 0xad, 0x34, 0x06, 0x09, 0xf4, 0xb3, 0x02,
				0x83, 0xe4, 0x88, 0x83, 0x25, 0xf1, 0x41, 0x5a,
				0x08, 0x51, 0x25, 0xe8, 0xf7, 0xcd, 0xc9, 0x9f,
				0xd9, 0x1d, 0xbd, 0x72, 0x80, 0x37, 0x3c, 0x5b,
				0xd8, 0x82, 0x3e, 0x31, 0x56, 0x34, 0x8f, 0x5b,
				0xae, 0x6d, 0xac, 0xd4, 0x36, 0xc9, 0x19, 0xc6,
				0xdd, 0x53, 0xe2, 0x34, 0x87, 0xda, 0x03, 0xfd,
				0x02, 0x39, 0x63, 0x06, 0xd2, 0x48, 0xcd, 0xa0,
				0xe9, 0x9f, 0x33, 0x42, 0x0f, 0x57, 0x7e, 0xe8,
				0xce, 0x54, 0xb6, 0x70, 0x80, 0x28, 0x0d, 0x1e,
				0xc6, 0x98, 0x21, 0xbc, 0xb6, 0xa8, 0x83, 0x93,
				0x96, 0xf9, 0x65, 0xab, 0x6f, 0xf7, 0x2a, 0x70, 0x00,

				/// FuncOffsetTable
				0x02, 0x00, 0x00, 0x01, 0x17, 0x00, 0x00, 0x00,
				0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,

				/// Samples
				snehasish: Can we use the text format (with some additional helper functions) here instead of the binary data? It would be hard to update in case of changes in the future.
				huangjd (author): See the comments: "... A unit test is required because the function names are not printable ASCII characters."
				huangjd (author): This test case is very theoretical, as I can't find two printable ASCII strings with colliding MD5 values (some surely exist, but none is known yet). In this case I have to use a unit test because I cannot validate the output using llvm-lit, which requires printable characters.
				snehasish: "test is required because the function names below (i.e. String1 and String2) are not printable ASCII characters." I see, I got confused because I thought "String1" was used in a literal sense. Perhaps enhance the comments above to note this?
				/// String1:10:1
				/// 1: 5
				/// 2.3: 6
				/// 4: String2:100
				/// 1: 100
				/// String2:7:3
				/// 9: 0
				0x01, 0x00, 0x0a, 0x02, 0x01, 0x00, 0x05, 0x00,
				0x02, 0x03, 0x06, 0x00, 0x01, 0x04, 0x00, 0x01,
				0x64, 0x01, 0x01, 0x00, 0x64, 0x00, 0x00,

				0x03, 0x01, 0x07, 0x01, 0x09, 0x00, 0x00, 0x00,
				0x00};
				// clang-format on

				using namespace llvm;
				using namespace llvm::sampleprof;

				TEST(MD5CollisionTest, TestCollision) {
				auto InputBuffer = MemoryBuffer::getMemBuffer(
				StringRef(reinterpret_cast<const char *>(ProfileData),
				sizeof(ProfileData)),
				"", false);
				LLVMContext Context;
				auto FileSystem = vfs::getRealFileSystem();
				auto Result = SampleProfileReader::create(InputBuffer, Context, *FileSystem);
				ASSERT_TRUE(Result);
				SampleProfileReader *Reader = Result->get();
				ASSERT_FALSE(Reader->read());

				std::vector<StringRef> &NameTable = *Reader->getNameTable();
				ASSERT_EQ(NameTable.size(), 2U);
				snehasish: This should be an assert, since if it doesn't hold, the following lines, which dereference NameTable[0] and NameTable[1], will segfault.
				StringRef S1 = NameTable[0];
				StringRef S2 = NameTable[1];
				ASSERT_NE(S1, S2);
				ASSERT_EQ(MD5Hash(S1), ExpectedHash);
				ASSERT_EQ(MD5Hash(S2), ExpectedHash);

				// S2's MD5 value collides with S1, S1 is expected to be dropped when S2 is
				// inserted, as if S1 never existed.

				FunctionSamples ExpectedFS;
				ExpectedFS.setName(S2);
				ExpectedFS.setHeadSamples(3);
				ExpectedFS.setTotalSamples(7);
				ExpectedFS.addBodySamples(9, 0, 0);

				SampleProfileMap &Profiles = Reader->getProfiles();
				EXPECT_EQ(Profiles.size(), 1U);
				if (Profiles.size()) {
				auto &[Hash, FS] = *Profiles.begin();
				EXPECT_EQ(Hash, hash_code(ExpectedHash));
				EXPECT_EQ(FS, ExpectedFS);
				}

				// Inserting S2 again should fail, returning the existing sample unchanged.
				snehasish: Capture with structured bindings here, with clearer variable names, to make it easier to read?
				auto [It1, Inserted1] = Profiles.try_emplace(S2, FunctionSamples());
				EXPECT_FALSE(Inserted1);
				EXPECT_EQ(Profiles.size(), 1U);
				if (Profiles.size()) {
				auto &[Hash, FS] = *It1;
				EXPECT_EQ(Hash, hash_code(ExpectedHash));
				EXPECT_EQ(FS, ExpectedFS);
				}

				// Inserting S1 should success as if S2 never existed, and S2 is erased.
				FunctionSamples FS1;
				FS1.setName(S1);
				FS1.setHeadSamples(5);
				FS1.setTotalSamples(10);
				FS1.addBodySamples(1, 2, 5);

				auto [It2, Inserted2] = Profiles.try_emplace(S1, FS1);
				EXPECT_TRUE(Inserted2);
				EXPECT_EQ(Profiles.size(), 1U);
				if (Profiles.size()) {
				auto &[Hash, FS] = *It2;
				EXPECT_EQ(Hash, hash_code(ExpectedHash));
				EXPECT_EQ(FS, FS1);
				}
				}
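
The relationship between the full MD5 digest quoted in the test's comment and the `ExpectedHash` constant can be checked by hand. This is a minimal sketch, assuming the 64-bit key is the first 8 bytes of the digest read as a little-endian integer, which is consistent with the two values given in the test; `low64FromHexDigest` is a hypothetical helper and not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Interpret the first 8 bytes (16 hex digits) of an MD5 hex digest as a
// little-endian 64-bit integer.
inline uint64_t low64FromHexDigest(const std::string &Hex) {
  assert(Hex.size() >= 16 && "need at least 8 bytes of digest");
  uint64_t Value = 0;
  // Walk from the most significant byte (index 7) down to byte 0, so that
  // byte 0 of the digest ends up in the lowest 8 bits of the result.
  for (int Byte = 7; Byte >= 0; --Byte) {
    unsigned long B = std::stoul(Hex.substr(Byte * 2, 2), nullptr, 16);
    Value = (Value << 8) | B;
  }
  return Value;
}
```

Applied to 79054025255fb1a26e4bc422aef54eb4, the bytes 79 05 40 25 25 5f b1 a2 reverse to 0xa2b15f2525400579, the value the test expects for both colliding strings.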

llvm/unittests/tools/llvm-profdata/OutputSizeLimitTest.cpp

Show First 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	static ExpectedErrorOr<void *> RunTest(StringRef Input, size_t SizeLimit,
// Check temp file is actually within size limit.		// Check temp file is actually within size limit.
uint64_t FileSize;		uint64_t FileSize;
RETURN_IF_ERROR(sys::fs::file_size(Temp.path(), FileSize));		RETURN_IF_ERROR(sys::fs::file_size(Temp.path(), FileSize));
EXPECT_LE(FileSize, SizeLimit);		EXPECT_LE(FileSize, SizeLimit);

// For every sample in the new profile, confirm it is in the old profile and		// For every sample in the new profile, confirm it is in the old profile and
// unchanged.		// unchanged.
for (auto Sample : NewProfiles) {		for (auto Sample : NewProfiles) {
auto FindResult = OldProfiles.find(Sample.first);		auto FindResult = OldProfiles.find(Sample.second.getContext());
EXPECT_NE(FindResult, OldProfiles.end());		EXPECT_NE(FindResult, OldProfiles.end());
if (FindResult != OldProfiles.end()) {		if (FindResult != OldProfiles.end()) {
EXPECT_EQ(Sample.second.getHeadSamples(),		EXPECT_EQ(Sample.second.getHeadSamples(),
FindResult->second.getHeadSamples());		FindResult->second.getHeadSamples());
EXPECT_EQ(Sample.second, FindResult->second);		EXPECT_EQ(Sample.second, FindResult->second);
}		}
}		}
return nullptr;		return nullptr;
Show All 33 Lines

Revision Contents

Diff 551240
