This is an archive of the discontinued LLVM Phabricator instance.

[clang][deps] Squash caches for original and minimized files
ClosedPublic

Authored by jansvoboda11 on Dec 8 2021, 7:09 AM.

Details

Summary

The minimizing and caching filesystem used by the dependency scanner keeps minimized and original files in separate caches.

This setup is not well suited for dealing with files that are sometimes minimized and sometimes not. Such files are being stat-ed and read twice, which is wasteful and also means the two versions of the file can get "out of sync".

This patch squashes the two caches together. When a file is stat-ed or read, its original contents are populated. If a file needs to be minimized, we give the minimizer the already loaded contents instead of reading the file again.

Diff Detail

Event Timeline

jansvoboda11 created this revision.Dec 8 2021, 7:09 AM
jansvoboda11 requested review of this revision.Dec 8 2021, 7:09 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 8 2021, 7:09 AM

Update documentation

This looks really nice.

One problem is that it's subtle to confirm whether it's thread-safe. The local cache access parts of CachedFileSystemEntry lock-free, but the CachedFileSystemEntry might get changed later (i.e., filled in with minimized content). Are accessed fields guaranteed by context not to be the ones that mutate later?

clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
50–51

Might be more natural as a non-static:

void minimize();
69–75

I *think* this is thread-safe -- since Minimized should be the same as when the local cache entry was created -- but it's a bit subtle.

The problematic case I am worried about:

  • First use in local cache is non-minimized.
    • Creates shared cache entry that's not minimized.
  • Some other local cache wants it to be minimized.
  • Later use in local cache is minimized.
    • Accessing Minimized pointer races with the other thread setting it.

If the local cache is guaranteed to always pass the same value for Minimized as when it fist accessed it, then there shouldn't be a problem...

I wonder if there's a way to make it less subtle?

106–108

I'm finding it a bit subtle detecting if there are races on access / setting of these, but I think it's correct.

  • I think I verified that they are "set once".
  • All the setters are guarded by locks.
  • The first getter per "local" cache is guarded by a lock.
  • Subsequent getters are not.

The key question: are the subsequent getters ONLY used when the first getter was successful?

One way to make it more obvious:

  struct ContentWithPPRanges {
    std::unique_ptr<llvm::MemoryBuffer> Content;
    PreprocessorSkippedRangeMapping PPSkippedRangeMapping;
  };

private:
  // Always accessed,mutated under a lock. Not mutated after they escape.
  std::unique_ptr<llvm::MemoryBuffer> Original;
  std::unique_ptr<llvm::MemoryBuffer> Minimized;
  PreprocessorSkippedRangeMapping PPSkippedRangeMapping;

  // Used in getters. Pointed-at memory immutable after these are set.
  std::atomic<const llvm::MemoryBuffer *> OriginalAccess;
  std::atomic<const llvm::MemoryBuffer *> MinimizedAccess;
  std::atomic<const PreprocessorSkippedRangeMapping *> PPRangesAccess;

I don't think this is the only approach though.

clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
189–191

Is the DependencyScanningWorkerFilesystem guaranteed to always want either minimized or not minimized?

IOW, if the same filesystem is reused, and the first time only the original file was needed and later the minimized was needed, I don't see the code path for minimizing the file later. But maybe reusing one of these FSs is not supported?

Introduce EntryRef, handle delayed minimization.

Prevent potential data races by wrapping access to Entry.getPPSkippedRangeMapping() in if (Entry.isMinimized()).

This looks really nice.

One problem is that it's subtle to confirm whether it's thread-safe. The local cache access parts of CachedFileSystemEntry lock-free, but the CachedFileSystemEntry might get changed later (i.e., filled in with minimized content). Are accessed fields guaranteed by context not to be the ones that mutate later?

I believe the latest revision is thread-safe (albeit with benign data-races on needsMinimization). Once a CachedFileSystemEntry has been created, only the MinimizedContents and PPSkippedRangeMapping members can be modified (at most once, under a lock). Access to those fields only happens in "minimizing" contexts.

jansvoboda11 marked 3 inline comments as done.Dec 9 2021, 7:53 AM
jansvoboda11 added inline comments.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
50–51

Agreed. I originally wanted to minimize diff of the implementation (by being able to refer to Result as before), but I'm happy making this an instance method.

69–75

I extracted the logic into EntryRef which now gets returned from the main function. It carries the bool Minimized bit and makes reasoning about this more local. WDYT?

106–108

I think there are no races on the original contents. The pointer is unconditionally set on creation of CachedFileSystemEntry under a lock that no threads get past without having set the pointer (or failing and never accessing the pointer).

For minimized contents, the latest revision adds check at the beginning of the main function (needsMinimization) outside the critical section. There are three paths I can think of:

  • The check returns true in thread A (pointer is null), thread A enters critical section, minimizes the contents and initializes the pointer.
  • The check returns true in thread A, but thread B entered the critical section, minimized contents and initialized the pointer. When thread A enters the critical section, it performs the check again, figures that out and skips minimization.
  • The check returns false and the local cache entry is returned.

So there definitely is a race here, but I believe it's benign. Do you agree? Do you think it's worth addressing?

clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
189–191

You're right, minimization of a particular file can be turned on/off on the fly. I already have a test for this in D114966 but somehow dropped it when extracting this patch. Fixed in the latest revision.

261

Maybe the isMinimized() check should be performed internally in EntryRef when accessing getPPSkippedRangeMapping() to make the race avoidance more local...

dexonsmith added inline comments.Dec 9 2021, 2:57 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
106–108

I don't trust myself to evaluate whether it's benign, but if there's no atomic mutation, then I think it's possible that when the setter changes a value from "x" to "y" then the racing reader can see a third value "z". You might gain some confidence by using -fsanitize=thread on a workload that's going to include this sort of thing -- probably hard to exercise: PCH already exists, try minimizing something that uses the PCH, and then try minimizing something that doesn't.

I'd rather just avoid the race entirely so we don't need to guess though.

jansvoboda11 marked 3 inline comments as done.Dec 10 2021, 7:38 AM
jansvoboda11 added inline comments.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
106–108

Interesting...

After reading up on this a bit, my understanding is that reads of MinimizedContents cannot be torn, because it's pointers-sized and aligned. So we should never see a third value "z". Am I wrong?

The potential data race is IMO somewhat independent from the read tearing aspect and is avoided by defensively checking MinimizedContents again under lock.

To confirm, I ran the following test case with and without thread sanitizer, never seeing data races or incorrect results.

I'm happy to use the std::atomic pattern you suggested, but I want to be sure I understand why that's necessary.

dexonsmith added inline comments.Dec 14 2021, 12:04 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
106–108

Heh, I don't know what can and cannot tear (especially on different architectures/etc.), I'm just wary. I'll trust your homework, but please add a comment documenting why it's thread-safe to read without atomics/locks.

Rebase, use std::atomic to prevent data races.

jansvoboda11 marked 3 inline comments as done.Dec 15 2021, 8:52 AM
jansvoboda11 added inline comments.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
106–108

I believe I got to the bottom of it. The pattern I'm using is called "double-checked locking" and is considered unsafe without atomics, since compiler optimizations can break it on some platforms. Adding a memory fence (std::atomic) here should make this safe and portable.

dexonsmith accepted this revision.Dec 15 2021, 3:30 PM

LGTM after you fix one more race (and add good code comments). Reads of PPSkippedRangeMappings don't happen until after MinimizedContentsAccess is checked, but the writes need to happen in reverse order to properly guard.

clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
89

I think the .load() is unnecessary due to atomic<T>::operator T() const, but up to you; arguably it helps readability here (as redundant "documentation" that it's an atomic load).

106–108

Okay, makes sense.

(IMO, storing it twice is silly. We should just have an atomic variant of unique_ptr, maybe called ThreadSafeUniquePtr, which uses std::atomic<T*> and has no customizable deleter (or requires deleter to have no storage; or adds a mutex if/when deleter needs storage). But not for this patch...)

clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
60

I think you should move this line *below* the assignment to PPSkippedRangeMapping so that no one tries to read that until it's ready. (I have a few other comments inline to point out why; please add a good comment explaining it in code too...)

Also, I think you can just write this as:

MinimizedContentsAccess = MinimizedContentsStorage.get();
76–85

(PPSkippedRangeMapping isn't ready to be read until here...)

167

(I think needsUpdate() can return false as soon as MinimizedContentsAccess is set, but PPSkippedRangeMappings won't be ready to read yet.)

This revision is now accepted and ready to land.Dec 15 2021, 3:30 PM
This revision was landed with ongoing or failed builds.Dec 16 2021, 12:57 AM
This revision was automatically updated to reflect the committed changes.
jansvoboda11 marked an inline comment as done.

Committed with the writes swapped and explanation in the comment. Thanks for the review!