Download Raw Diff

Details

Reviewers

tejohnson
pcc
mehdi_amini

Commits

rG8f299fd2cff7: Fixed false ThinLTO cache misses problem (PR 45819).
rG252892fea708: Fixed false ThinLTO cache misses problem (PR 45819).

Summary

See PR 45819 for additional information (such as how to reproduce the bug, problem analysis etc).
https://bugs.llvm.org/show_bug.cgi?id=45819

We noticed that when building an executable with ThinLTO (cache enabled), from time to time unexpected cache misses happening and new cache entries are generated when not needed. It doesn't happen in a predictable manner (i.e. a cache miss might or might not happen).

‘ExportList’(‘DenseSet’ of ValueInfo's) is used for the calculation of a hash value (LTO Cache Key).
Though the elements of this DenseSet are the same all the time, the order in which the iterator walks through the elements of this set might (or might not) be different the next time when we relink with ThinLTO. If the order happens to be different, we will generate a different hash value and a cache miss will happen.

Looking at the implementation of:

template <> struct DenseMapInfo<ValueInfo> (see ModuleSummaryIndex.h)

we notice that the hash value that is being used by a DenseMap is a pointer:

static unsigned getHashValue(ValueInfo I) { return (uintptr_t)I.getRef(); }

We cannot guarantee that the compiler will allocate memory at the same
address for the object when we run the executable for the second time.

The same problem is applicable to ImportList as well.

A proposed solution here is:
(a) Create a vector of Export List's GUID (int64 integers) and sort this vector. Iterate through this sorted vector and use its elements for cache entry value calculations.
(b) Create a vector of ImportList's ModuleIDs (StringRefs) and sort this vector in lexicographical order. Iterate through this sorted vector. Use ModuleID as a key for obtaining an iterator to corresponding element of the ImportList, which has StringMap type. Use ModuleID (first element of the pair) and the list of GUIDs of all functions to import for this module (second element of the pair) for cache entry value calculations.

Note: it looks like I just spotted another potential problem in here. If confirmed, I will do a separate fix for it.
We use unordered_set container for storing GUIDs of all the functions to import for a source module.
using FunctionsToImportTy = std::unordered_set<GlobalValue::GUID>;
That means that in theory, when when we iterate through unordered_set for the second time, we could get a different order of the elements in this set, computing a different hash value and causing a potential cache miss.

Diff Detail

Event Timeline

kromanova created this revision.May 12 2020, 3:03 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 12 2020, 3:03 AM

Herald added subscribers: llvm-commits, dang, dexonsmith and 4 others. · View Herald Transcript

Harbormaster completed remote builds in B56415: Diff 263334.May 12 2020, 4:15 AM

kromanova added subscribers: tejohnson, mehdi_amini, pcc.May 19 2020, 3:12 PM

LGTM

FYI with respect to:

we notice that the hash value that is being used by a DenseMap is a pointer:

static unsigned getHashValue(ValueInfo I) { return (uintptr_t)I.getRef(); }

We cannot guarantee that the compiler will allocate memory at the same
address for the object when we run the executable for the second time.

DenseMap is not an ordered container, we can't rely on its ordering regardless of the hash function. There is even a build-time option (-DLLVM_REVERSE_ITERATION=ON) to reverse the iteration order of such container to ensure we don't accidentally depend on it: https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/ReverseIteration.h

llvm/lib/LTO/LTO.cpp
150	Typo: "elemets"
151	I think you can just run `llvm::sort(ExportsGUID);`? We don't need a "stable sort" here (can't tell the difference anyway right?)
160	Can you reserve here?
165	I think is can just be `sort(ImportModulesVector);` here? (llvm::sort)
168	What about storing in the vector a reference to the entry in the first place and avoid the extra lookup here? using MapEntryTy = FunctionImporter::ImportMapTy::MapEntryTy; std::vector<MapEntryTy > ImportModulesVector; ImportModulesVector.reserve(ImportList.size()); for (auto &Entry : ImportList) ImportModulesVector.push_back(&Entry); llvm::sort(ImportModulesVector, [] (const MapEntryTy Lhs, const MapEntryTy Rhs) { return Lhs < Rhs; }); for (MapEntryTy Entry : ImportModulesVector) { ...

This revision is now accepted and ready to land.May 19 2020, 9:06 PM

Updated the code to fix code review comments.

llvm/lib/LTO/LTO.cpp
168	Good catch! I had to make a few changes to the code that you are suggesting. The rest of the code stayed the same. I will upload the updated patch to phabricator for one more round of reviews. (1) I added const qualifier for Entry (auto &Entry : ImportList) -> (const auto &Entry : ImportList), otherwise this code doesn't compile. (2) Same for (MapEntryTy Entry : ImportModulesVector)-> (const MapEntryTy Entry : ImportModulesVector) (3) I assume that you didn't mean the compare the pointers here? llvm::sort(ImportModulesVector, [] (const MapEntryTy Lhs, const MapEntryTy Rhs) { return Lhs < Rhs; I changed it to llvm::sort(ImportModulesVector, [](const MapEntryTy Lhs, const MapEntryTy Rhs) { return Lhs->getKey() < Rhs->getKey(); });

I accidentally deleted a couple of important lines in previous patch. Please ignore it and look at the latest patch only.

Harbormaster failed remote builds in B58383: Diff 267133!May 29 2020, 1:35 AM

Instead of storing pointers in the vector as you suggested, I think it will be better to store iterators. See the code snippet below.
This approach better fits C++ style. For example, this will guarantee that if operator -> for type pointer to MapEntryTy will be overloaded, it will not affect the logic of the code.
What do you think? If you agree, I will submit a new revision with these changes.

// Include the hash for every module we import functions from. The set of
// imported symbols for each module may affect code generation and is
// sensitive to link order, so include that as well.
using ImportMapIteratorTy = FunctionImporter::ImportMapTy::const_iterator;
std::vector<ImportMapIteratorTy> ImportModulesVector;
ImportModulesVector.reserve(ImportList.size());

for (ImportMapIteratorTy it = ImportList.begin(); it != ImportList.end();
     ++it) {
  ImportModulesVector.push_back(it);
}
llvm::sort(
    ImportModulesVector,
    [](const ImportMapIteratorTy &Lhs, const ImportMapIteratorTy &Rhs) -> bool {
      return Lhs->getKey() < Rhs->getKey();
    });
for (const ImportMapIteratorTy &EntryIt : ImportModulesVector) {
  auto ModHash = Index.getModuleHash(EntryIt->first());
  Hasher.update(ArrayRef<uint8_t>((uint8_t *)&ModHash[0], sizeof(ModHash)));

  AddUint64(EntryIt->second.size());
  for (auto &Fn : EntryIt->second)
    AddUint64(Fn);
}

LGTM

In D79772#2062317, @kromanova wrote:

Instead of storing pointers in the vector as you suggested, I think it will be better to store iterators.

Either way is fine with me

Updated the patch to use iterators instead of pointers.
No need to review since Mehdi looked and approved already.

Harbormaster failed remote builds in B58495: Diff 267367!May 29 2020, 3:20 PM

Closed by commit rG252892fea708: Fixed false ThinLTO cache misses problem (PR 45819). (authored by kromanova). · Explain WhyJun 10 2020, 12:49 PM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

Nondeterminism of iterators causes false ThinLTO cache misses
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 263334

llvm/lib/LTO/LTO.cpp

This is an archive of the discontinued LLVM Phabricator instance.

Nondeterminism of iterators causes false ThinLTO cache misses ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 263334

llvm/lib/LTO/LTO.cpp

Nondeterminism of iterators causes false ThinLTO cache misses
ClosedPublic