This is an archive of the discontinued LLVM Phabricator instance.

[clang][deps] Ensure filesystem cache consistency
ClosedPublic

Authored by jansvoboda11 on Dec 2 2021, 9:29 AM.

Details

Summary

The minimizing filesystem used by the dependency scanner isn't great when it comes to the consistency of its caches. There are two problems that can be exposed by a filesystem that changes during dependency scan:

  1. In-memory cache entries for original and minimized files are distinct, populated at different times using separate stat/open syscalls. This means that when a file is read with minimization disabled, its contents might be inconsistent when the same file is read with minimization enabled at later point (and vice versa).
  2. In-memory cache entries are indexed by filename. This is problematic for symlinks, where the contents of the symlink might be inconsistent with contents of the original file (for the same reason as in problem 1).

This patch ensures consistency by always stating/reading a file exactly once. The original contents are always cached and minimized contents are derived from that on demand. The cache entries are now indexed by their UniqueID ensuring consistency for symlinks too. Moreover, the stat/read syscalls are now issued outside of critical section.

Depends on D115935.

Diff Detail

Event Timeline

jansvoboda11 created this revision.Dec 2 2021, 9:29 AM
jansvoboda11 requested review of this revision.Dec 2 2021, 9:29 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 2 2021, 9:29 AM

Add unit test, IWYU.

jansvoboda11 edited the summary of this revision. (Show Details)Dec 2 2021, 10:34 AM
jansvoboda11 edited the summary of this revision. (Show Details)Dec 2 2021, 10:45 AM
jansvoboda11 edited the summary of this revision. (Show Details)Dec 2 2021, 11:54 AM
jansvoboda11 edited the summary of this revision. (Show Details)Dec 2 2021, 11:57 AM
jansvoboda11 edited the summary of this revision. (Show Details)

Thanks for working on this; seems like a great start. At a high-level:

  • We should check overhead. It'd be good to benchmark scanning LLVM with clang-scan-deps before and after this change.
  • The locking is getting harder to track, since the acquisition and release are disconnected. I'd rather use a pattern that kept this simple.
  • Identified a few pre-existing issues that might be exacerbated by (and/or harder to fix after) this refactor.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
38

I think this should be a StringRef (or MemoryBufferRef, which you can construct from two StringRefs, but given that the name is already in the Status object probably not useful).

68–69

Doesn't seem to be a good reason to save a null string. Just use a StringRef().

68–70

I find these two typedefs a bit obfuscating. I see that they might provide some benefit in the patch as-is because of the imposed requirement that returned result uses a pointer to a SmallString<1>; as such it's important that the type be identical.

  • Instead, it should use a StringRef to avoid depending on storage (already commented above).
  • Even if not for that, it could/should use a SmallVectorImpl<char> to avoid imposing a specific requirement on the small size.

Then Contents and OriginalContents can be skipped (the latter becoming std::unique_ptr<llvm::MemoryBuffer>, but without the obfuscation of a typedef).

73

This should be the std::unique_ptr<MemoryBuffer> from disk. There's no reason to memcpy it into a new allocation.

78

name/field match is a bit confusing. I'm not sure the typedef is buying much here.

131–132

You don't need the heavyweight std::map for reference validation. You can just use a DenseMap<KeyT, std::unique_ptr<ValueT>>. That's pretty expensive due to allocation traffic, but it's still cheaper than a std::map.

But you can also avoid the allocation traffic by using a BumpPtrAllocator, the same pattern as the StringMap above. E.g.:

llvm::SpecificBumpPtrAllocator<OriginalContents> OriginalContentsAlloc;
llvm::DenseMap<llvm::sys::fs::UniqueID, OriginalContents *> OriginalContentsCache;

// insert into shard:
OriginalContents &getOriginalContentContainer(...) {
  std::scoped_lock<std::mutex> L(CacheMutex);
  OriginalContents *&OC = OriginalContents[UID];
  if (!OC)
    OC = new (OriginalContentsAlloc) OriginalContents;
  return *OC;
}

// get original content:
StringRef getOriginalContentBuffer(...) {
  OriginalContents &OC = getOriginalContentContainer(...);
  if (OC.IsInitialized)
    return OC->Content->getBuffer();

  // Could put this after the lock I guess...
  std::unique_ptr<MemoryBuffer> Content = readFile(...);

  // check IsInitialized again after locking in case there's a race
  std::scoped_lock<std::mutex> L(SharedStat.Mutex);
  if (OC->IsInitialized)
    return OC->Content->getBuffer();

  OC->Content = std::move(Content);
  OC->IsInitialized = true;
  return OC->Content->getBuffer();
}

Same pattern for minimized content cache. Since the locks are only held briefly there's no need to pass them around and lose clarity about how long it's open. Also, IIRC, std::unique_lock is more expensive than std::scoped_lock (but my memory could be faulty).

133–135

I wonder if these should really be separate.

Seems better to have something like:

struct SharedContent {
  // Even better: unique_atomic_ptr<MemoryBuffer>, to enable lock-free access/updates.
  atomic<bool> HasOriginal;
  std::unique_ptr<MemoryBuffer> Original; 

  // Even better: std::atomic<MinimizedContent *>, with the latter bumpptrallocated, to
  // enable lock-free access/updates.
  atomic<bool> HasMinimized;
  SmallString<0> Minimized; // would be nice to bumpptrallocate this string...
  PreprocessorSkippedRangeMapping PPSkippedRangeMapping;
};
SpecificBumpPtrAllocator<SharedCachedContent> ContentAlloc;
DenseMap<llvm::sys::fs::UniqueID, SharedCachedContent *> ContentCache;

With that in place, seems like the SharedStat can have std::atomic<SharedContent *>, which caches the result of the UID lookup. This way the UID DenseMap lookup is once per stat name, saving reducing contention on the per-shard lock.

Then in the local cache, the only map storage would be:

llvm::StringMap<SharedStat *, llvm::BumpPtrAllocator> LocalStatCache;

No need to duplicate the UID-keyed caches, since the lookups there would set the pointer for the SharedContent.

clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
25–31

Is there a potential (already existing) race condition here? Can't the file change between the stat and opening the buffer?

Seems like either:

  • The Stat should be updated to have the observed size of the buffer.
  • An error should be returned if the size doesn't match.
  • The stat and/or read should be retried until they match.
36–38

This will introduce a memory regression in the common case where there are no PCHs.

  • Previously, only minimized files were saved in memory. These are relatively small, so probably no big deal.
  • Now, the original file is being saved as well. These are not small.

Instead, the MemoryBuffer should be saved directly.

  • For large files whose size isn't on a page boundary, this will be an mmap. This doesn't count against process memory because the kernel can optimize this easily, such as by sharing between processes (e.g., with actual compilation).
  • For large files on page boundaries, there was already a memcpy done in order to make this null-terminated. No reason to do that again here.
  • For small files, this is already a buffer on the heap... the extra memcpy and allocation probably doesn't matter all that much, but the large file case is worth optimizing for.

This wasteful was already around when files weren't being minimized files, but it's going to use a lot more memory now that original files are stored even when they're going to be minimized.

249

I don't love the lack of clarity between when the lock is taken and when it's released caused by this being an out parameter. I don't have a specific suggestion, but maybe there's another way to factor the code overall?

253–255

Calling readFile() behind a lock doesn't seem great. I did confirm that the original code seems to do the same thing (lock outside of createFilesystemEntry), but this refactor seems to bake the pattern into a few more places.

When races aren't very likely it's usually cheaper to:

  • lock to check cache, returning cached result if so
  • without a lock, compute result
  • lock to set cache, but if the cache has been filled in the meantime by another thread, return that and throw out the just-computed one

Maybe it'd be useful to add:

std::atomic<bool> IsInitialized;

to the MinimizedContents and OriginalContents structures stored in the shared cache. This could make it easier to decouple insertion in the shared cache from initialization. I.e., it'd be safe to release the lock while doing work; another thread won't think the default-constructed contents are correct.

jansvoboda11 marked 2 inline comments as done.Dec 6 2021, 9:21 AM

Thanks for working on this; seems like a great start.

Thanks a lot for the extensive feedback! I'll work through it and create prep patches where sensible. It seems like things can be simplified quite a bit.

At a high-level:

  • We should check overhead. It'd be good to benchmark scanning LLVM with clang-scan-deps before and after this change.

In my testing, this patch causes ~20% increase in memory usage.

  • The locking is getting harder to track, since the acquisition and release are disconnected. I'd rather use a pattern that kept this simple.

I agree. I think I'll explore the direction you suggested in one of your inline comments.

  • Identified a few pre-existing issues that might be exacerbated by (and/or harder to fix after) this refactor.

That makes sense.

clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
73

Fixed in new prep-patch: D115043.

131–132

I didn't think of using SpecificBumpPtrAllocator this way, seems really neat, thanks for the suggestion!

131–132

Yeah, scoped_lock should be cheaper, I'll create a prep-patch for that.

133–135

I really like idea of keeping a pointer to SharedContent in SharedStat and avoiding locking & lookup in the content caches. Merging original and minimized contents would probably simplify things quite a bit as well.

clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
249

Fair point, I'll try to simplify this.

253–255

Could you expand on this a bit more? If we have a lock for each file, how is locking, reading, unlocking slower than locking, unlocking, reading, locking, unlocking?

dexonsmith added inline comments.Dec 6 2021, 6:12 PM
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
253–255

You're right; if there's a lock per-file and all consumers want the result of all computations there's no benefit to releasing the lock quickly.

  • If some consumers only want partial results (or already-computed results), can be faster to release quickly.
  • Could be expensive to have mutexes per-file, since that's A LOT of mutexes. It might be cheaper in aggregate to switch to lock-free here.
jansvoboda11 marked 5 inline comments as done.

Rebase on top of D115346, apply suggested changes to the cache structure and allocation strategy.

jansvoboda11 marked 3 inline comments as done.Dec 15 2021, 10:19 AM
jansvoboda11 added inline comments.
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
25–31

I think that's right. I left a detailed FIXME in the code calling read() and would like to tackle that in a follow up. Would that be fine?

jansvoboda11 retitled this revision from [clang][deps] Split filesystem caches to [clang][deps] Split stat and file contents caches.Dec 15 2021, 10:19 AM
jansvoboda11 retitled this revision from [clang][deps] Split stat and file contents caches to [clang][deps] Split stat and file content caches.
dexonsmith added inline comments.Dec 15 2021, 3:03 PM
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
25–31

Yup, doing it in a separate commit makes sense. I suggest taking the first option, since it's the simplest.

dexonsmith added inline comments.Dec 15 2021, 4:00 PM
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
260–270

I'm not quite following this logic. I think it's safe (and important!) to modify MaybeStat if read() fails.

We're in a critical section that either creates and partially initializes the entry, or incrementally updates it.

In the "creates and partially initializes" case:

  • All other workers will get nullptr for Cache.getEntry() and try to enter the critical section.
  • We have just seen a successful MaybeStat value.
  • needsRead() will be true since we have not read contents before. We will immediately try to read.
  • read() should open the original contents and can safely:
    • on success, update the value for MaybeStat to match the observed size
    • on failure, drop the value and set the error for MaybeStat to the observed error
  • When we leave the critical section, either:
    • MaybeStat stores an error; no thread will enter the critical section again
    • OriginalContents are initialized and needsRead() returns false

In the "incrementally updates" case:

  • needsRead() returns false so read() will not be called
271

Seems like the existing stat value should be passed into read() and the second stat there removed.

dexonsmith added inline comments.Dec 15 2021, 8:07 PM
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
105–106

This name is a bit misleading... looks more like getOrCreateFileContents() to me.

260–270

The key property being the last bullet of the first case: that the "create and partially initializes" case guarantees that either MaybeStat stores an error (isReadable() is false) or OriginalContents is initialized (needsRead() returns false).


I think I've found the part I was missing: that this critical section is for a SharedCacheFileEntry (associated with a filename), but the OriginalContents is a field on a CachedFileContents which could apply to other UIDs (and SharedCache.getFileContents() is actually "get or create"). Since this commit creates the UID map, seems like maybe the race gets worse in this commit? (Not sure)


Another problem: the UID can change between the first stat above (call to getUnderlyingFS().status(Filename)) and the one inside read() if another process is writing at the same time. We can't trust the UID mapping from the first status() call unless content already exists for that UID.

I think to avoid this race you need to delay creating "UID to content" map entry until there is the result of a successful read() to store.

I'll describe an algorithm that I think is fairly clean that handles this. I'm using different data structure names to avoid confusion since I've broken it down a little differently:

  • ReadResult: stat (for directories) OR error and uid (failed read) OR stat and content (and optional minimized content and pp ranges, and a way to update them atomically))
  • FilenameMap: map from Filename to ReadResult* (shared and sharded; mirrored locally in each worker)
  • UIDMap: map from UID to ReadResult* (shared and sharded; probably no local mirror)

And here's the algorithm:

// Top-level API: get the entry/result for some filename.
ErrorOr<ReadResult &> getOrCreateResult(StringRef Filename) {
  if (ReadResult *Result = lookupEntryForFilename(Filename))
    return minimizeIfNecessary(*Result, ShouldMinimize);
  if (ErrorOr<ReadResult &> Result = computeAndStoreResult(Filename))
    return minimizeIfNecessary(*Result, ShouldMinimize);
  else
    return Result.getError();
}
// Compute and store an entry/result for some filename. Returned result
// has in-sync stat+read info (assuming read was successful).
ErrorOr<ReadResult &> computeAndStoreResult(StringRef Filename) {
  ErrorOr<Status> Stat = UnderlyingFS->status(Filename);
  if (!Stat)
    return Stat.getError(); // Can't cache missing files.
  if (ReadResult *Result = lookupEntryForUID(Stat->UID))
    return storeFilenameEntry(Filename, *Result); // UID already known.
  // UID not known. Compute a ReadResult.
  //
  // Unless this is a directory (where we don't need to go back to the FS),
  // ignore existing 'Stat' because without an open file descriptor the UID
  // could change.
  Optional<ReadResult> Result;
  if (Stat->isDirectory())
    Result = ReadResult(*Stat);
  else if (ErrorOr<ReadResult> MaybeResult = computeReadResult(Filename))
    Result = std::move(*MaybeResult);
  else
    return MaybeResult.getError(); // File disappeared...
  // Store the result. Cascade through UID then Filename. Each level could
  // return a different result than it was passed in.
  return storeEntryForFilenameOrReturnExisting(Filename,
             storeEntryForUIDOrReturnExisting(std::move(*Result));
}
// Lookup existing result in FilenameMap. No mutation. First checks local map
// then falls back to the shared map (locks shard, lookup, unlocks, saves in
// local map, returns).
ReadResult *lookupEntryForFilename(StringRef Filename);
// Lookup existing result in UIDMap. No mutation. No local map, just a shared
// map (lockshard+lookup+return).
ReadResult *lookupEntryForUID(UniqueID);
// Compute read result using a single file descriptor.
// - Return error if `open()` fails. Can't cache missing files.
// - Else compute ReadResult: Stat open file descriptor and get a memory buffer from it.
// Note: "Error" state if stat fails.
// Note: "Error" state if stat succeeds and memory buffer does not open.
// Note: if the memory buffer opens successfully, status updated with observed size.
// Note: does not take a UID parameter since live FS could have changed.
// Note: does not access or mutate UIDMap/FilenameMap/etc.
ErrorOr<ReadResult> computeReadResult(StringRef Filename);
// Compare-exchange. Pulls UID out of NewResult. Locks shard for UIDMap[UID]; checks for
// existing result; if none, bump-ptr-allocates and stores NewResult; returns stored
// result.
ReadResult& storeEntryForUIDOrReturnExisting(ReadResult &&NewResult);
// Compare-exchange. Locks shard for FilenameMap[Filename]; checks for existing result;
// if none, stores parameter; unlocks; updates local map with stored result and returns
// it.
ReadResult& storeEntryForFilenameOrReturnExisting(StringRef Filename, ReadResult &);
// If needed and missing, adds minimization info atomically. Note that Result
// may store a cached read error, or a directory.
ReadResult& minimizeIfNecessary(ReadResult& Result, bool ShouldMinimize);

The only thing "lost" is that two workers might both compute a ReadResult for the same file (the slower one having the work dropped on the floor). I'm skeptical this will matter in practice. If some measurement says it does, the FilenameMap could map from Filename to unique_ptr<pair<mutex,atomic<ReadResult*>>> and computeAndStoreResult() could take a lock at the start and re-check the map... but IMO it's better to make this simple and optimize for the common case.

This does leave behind an extended critical section in minimizeIfNecessary() to avoid racing to minimize, implying there's a mutex in ReadResult for updating the MinimizedContents and PPRanges. (I suspect minimization is fast enough (and racing workers rare and cheap enough) that the memory overhead of a per-ReadResult mutex is a bad tradeoff (a simple alternative would be to computeMinimized before the critical section, then take out a "store-minimization" lock just for setting the values (mutex could be sharded by the UID % 64 or something), but I'm less confident.)

This also leaves behind the double-stat behaviour (before and after open) for new files. New files should be pretty rare in the depscanner so maybe this is fine; I've also observed cases where failed open is slower than failed stat, so for the common case of missing files (which don't get cached...) this might be best.

jansvoboda11 added inline comments.Dec 16 2021, 8:45 AM
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
260–270

This patch (and D115346) were motivated by D114971, which prevents minimization of symlinks that point to files referenced by precompiled dependencies (e.g. a PCH).

When the dependency scanning worker disables minimization of a file referenced by a precompiled dependency, my idea was to immediately "canonicalize" the filename to UniqueIDs (through stat) and use that when deciding whether to minimize its contents in getOrCreateFileSystemEntry. With that approach, the "filename -> UniqueID" map/cache acts as the authority for stat information. However, I can see how this breaks when the FS is volatile. Besides the issue outlined in your last comment, the current approach in D114971 prolongs the pause between initial stat (when disabling file minimization - "configure time") and read ("query time"), increasing the chances for observing filesystem volatility.

I think your approach makes a lot of sense if we want to be really defensive against volatile FS. Making sure we don't have to re-stat or read files that were already stat-ed during "configuration" or previous "query" would be nice. I think that unfortunately means we need to actually read all input files of precompiled dependencies in D114971. Just stat-ing such files is no longer an option, since that would get us back to square one if we need to later read them (they might've changed).

I have implemented your idea locally, and will update this patch tomorrow.

Implement new version that ensures the stat and contents of a file are always in sync.

jansvoboda11 added inline comments.Dec 17 2021, 8:01 AM
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
324

I'm not sure these should be separate. We could end up in situation where the Filename map contains different entry than the UID map for the same directory entry. I'm tempted to merge these functions into one and perform the updates in a single critical section...

dexonsmith requested changes to this revision.Dec 20 2021, 10:31 AM

I'm liking the new direction here; requesting changes since it looks like the filename is being used to pick the shard for UIDMap, which will lead to multiple opinions of what each UID means.

clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
249

I'd use const& to avoid copying the string on the way in... see below.

251

I think, rather than move/copy the status name, the name should be wiped out to ensure no one relies on it. Every access should use copyWithNewName() since this is shared across all things that point to the same UID... so let's use copyWithNewName() here to drop the ignored name.

276–277

This doesn't look right to me. UIDs should be sharded independently of the filename they happen to have been reached by; otherwise each filename shard is developing its own idea of what each UID means. Since UID distribution is not uniform, probably the UID shard should be chosen by hash_value(Stat.getUniqueID()) % NumShards.

You could use the same sets of shards for UIDMap and FilenameMap, but since they're independent I'd probably do:

  • UIDCache: sharded by UID: UIDMap and BumpPtrAllocator for entries (and likely anything else tied to content)
  • FilenameCache: sharded by filename: FilenameMap (and perhaps other things tied to filename?)
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
23–24

In what circumstances should this return a cached-error TentativeEntry? Any?

24–25

Since the file was opened, should we return cached-error TentativeEntry here, rather than an error?

32–33

After a successful stat on the same file descriptor, it definitely feels like this is an error that should be cached, and a TentativeEntry that is in an error state should be returned.

324

I'm not sure these should be separate. We could end up in situation where the Filename map contains different entry than the UID map for the same directory entry.

I'm also sure precisely what you mean by "for the same directory entry" in this context; and I don't see what's wrong with the situation I think you're outlining.

I'm tempted to merge these functions into one and perform the updates in a single critical section...

A single critical section for setting UID and filename at the same time would be hard to get right (and efficient), since UIDs have aliases through other filenames due to different directory paths (dir/../x.h vs x.h) and filesystem links (hard and symbolic).

Here's the race that I think(?) you're worried about:

  • Worker1 does a tentative stat of "x.h", finds a UID that isn't mapped (UIDX1, but it's ignored...).
  • Worker2 does a tentative stat of "x.h", finds a UID that isn't mapped (UIDX1, but it's ignored...).
  • Worker1 opens "x.h", finds ContentX1+StatX1 (with UIDX1), saves mapping UIDX1 -> ContentX1+StatX1.
  • "x.h" changes.
  • Worker2 opens "x.h", finds ContentX2+StatX2 (with UIDX2), saves mapping UIDX2 -> ContentX2+StatX2.
  • Worker2 saves mapping "x.h" -> ContentX2+StatX2.
  • Both workers move forward with "x.h" -> ContentX2+StatX2.

IIUC, you're concerned that the mapping UIDX1 -> ContentX1+StatX1 was saved. The side effect is that if a future tentative stat of (e.g.) "y.h" returns UIDX1, then "y.h" will be mapped to ContentX1+StatX1. Is this what concerns you? Why? (Is it something else?)

The concern I have is that some filesystems recycle UIDs (maybe "x.h" *was* a symbolic link to "y.h" and then became its own file... or maybe "x.h" and "y.h" were hard links... or maybe "y.h" is just a new file!). But that's a problem with using UIDs to detect equivalent filesystem links / content in general. I don't see any reason to be more concerned here than elsewhere, and to avoid depending on UID we'd need a pretty different design (e.g., lazily detect and model directory structure and symbolic links).

This revision now requires changes to proceed.Dec 20 2021, 10:31 AM
jansvoboda11 added inline comments.Dec 21 2021, 7:55 AM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
276–277

Hmm, skimming through RealFileSystem::status, I saw that it's calling sys::fs::status with "follow symlinks" enabled. It made sense to me that the name stored in llvm::vfs::Status would match that and refer the fully resolved target entry, not the symlink itself. Seeing as this is not the case, I agree the UID itself should be used for choosing the shard.

clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
23–24

I don't think the distinction matters at this level. Whether failures should be cached is a decision that's being made one level up.

I personally prefer having TentativeEntry to be "non-fallible" and explicitly wrapping the whole thing in ErrorOr. That makes it easier for others to know what they are working with (i.e. this object cannot represent an error state). Eventually, I think this would make sense for the caches and CachedFileSystemEntry too.

24–25

Let's return an error here and create error CachedFileSystemEntry one level up.

324

Yes, that's the kind of scenario I was thinking about. I'm not concerned about consequences of that side effect, I just don't like storing garbage that will most likely never be used/referenced again and might be confusing during debugging.

I agree with you on UID recycling...

Erase filenames in temporary Stat objects, use UniqueID as shard key where appropriate.

dexonsmith requested changes to this revision.Jan 12 2022, 6:03 PM

Okay, I think this is the last round. Everything looks correct, except a few out-of-date comments.

Two things, one small, one bigger (but not too big I think).

Smaller one is that there's an unnecessary string copy when getting the status.

Bigger one is that I think CacheShard::ContentsCache can be deleted, since the map is fully redundant with CacheShard::EntriesByUID (most of the inline comments are about that). This is because they are both indexed by UID and created at the same time (two immediately adjacent locks of the same shard mutex with no computation in between). Relatedly, CachedFileSystemEntry::ContentsAccess is never modified after construction. Seems reasonable to keep a separate allocation for the CachedFileContent since it's fairly big and directories (and cached errors) don't need them, but the CachedFileSystemEntry can just point at a raw pointer whose value is never expected to change and we don't need the separate map.

clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
27–28

This should mention that it's shared across different filenames, only one per UID now, and the filename is empty in the stat.

79–82

This comment is out of date, since there's a 1:1 correspondence between CachedFileSystemEntry and CachedFileContents since they're both looked up by UID.

Also, ContentsAccess is never modified after construction so it can just be a raw pointer.

Outlining the allocation of CachedFileContents might still make sense, since it's big and stat failures (with no content) are probably the common case due to header search patterns... only if we actually create these for stat failures though. Might be worth a comment saying why it's outlined. Maybe it should even be delayed to a follow-up commit to simplify this patch, since now that CachedFileSystemEntry is per-UID it doesn't seem to be required here... but the fields probably need to be made mutable regardless somehow so it doesn't seem like a ton of churn.

122–123

I think this map can be deleted, since it's not actually used to deduplicate/share anything that EntriesByUID doesn't handle.

144–145

I think there's a new/unnecessary std::string copy in the case where copyWithNewSize() happens. If the size were fixed first that could be avoided:

auto Stat = Entry.getStatus();
if (!Stat.isDirectory())
  Stat = copyWithNewSize(...);
return copyWithNewName(Stat, Filename);
155–160

I think this can be removed / merged with getOrEmplaceEntryForUID(), based on how it's used.

clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
143–153

Looks like the two UID maps are always filled directly after each other. Seems like we can reduce lookups like this.

163–166

I think this can be merged into getOrEmplaceEntryForUID.

207–239

I think this can be one call since they're taking the same lock and always done one after the other.

This revision now requires changes to proceed.Jan 12 2022, 6:03 PM
jansvoboda11 marked 7 inline comments as done.

Update comments, remove std::atomic<>, merge getOrEmplaceContentsForUID into getOrEmplaceEntryForUID.

clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
79–82

I've updated the comment and changed ContentsAccess to a raw pointer.

I agree outlining CachedFileContents in a follow-up patch would be cleaner, but I don't have much time to spare on prettifying git history at this moment unfortunately.

122–123

Yeah, that's right. Removed this in the latest revision.

This revision is now accepted and ready to land.Jan 19 2022, 1:29 PM
jansvoboda11 retitled this revision from [clang][deps] Split stat and file content caches to [clang][deps] Ensure filesystem cache consistency.Jan 21 2022, 4:02 AM
jansvoboda11 edited the summary of this revision. (Show Details)
This revision was landed with ongoing or failed builds.Jan 21 2022, 4:04 AM
This revision was automatically updated to reflect the committed changes.