This is an archive of the discontinued LLVM Phabricator instance.

Module: use PCMCache to manage memory buffers for pcm files.
ClosedPublic

Authored by dexonsmith on Jan 4 2017, 9:58 AM.

Details

Summary

We create a PCMCache class to manage memory buffers associated with pcm files. With implicit modules, we currently use lock files to make
sure we are making progress when a single clang invocation builds a module file and then loads the module file immediately after. This
implementation requires lock file for correctness and causes all kinds of locking issues.

This patch tries to use PCMCache to share the content between writes and reads within a single clang invocation, so we can remove the
correctness-side of the lock file, and only use lock file for performance i.e to reduce chance of building the same pcm file from multiple
clang invocations.

We also add ThreadContext in PCMCache to diagnose cases where an ancestor validates a module and later on, a child thread tries
to invalidate it. Without the patch, this scenario will cause use-after-free of the FileEntry held by the ancestor.

Diff Detail

Event Timeline

manmanren updated this revision to Diff 83069.Jan 4 2017, 9:58 AM
manmanren retitled this revision from to Module: use PCMCache to manage memory buffers for pcm files..
manmanren updated this object.
manmanren added subscribers: cfe-commits, bruno.
aprantl added a subscriber: aprantl.Jan 4 2017, 1:22 PM
aprantl added inline comments.
include/clang/Basic/DiagnosticSerializationKinds.td
131 ↗(On Diff #83069)

Is this better?

"module file '%0' was validated as a system module and is now being imported as a non-system module; any differences in diagnostics options will be ignored"

135 ↗(On Diff #83069)

I wonder if there is a way to rephrase this message such that an end-user could understan how to interpret this error?

benlangmuir edited edge metadata.Jan 4 2017, 3:56 PM

Can we test the already-validated diagnostics?

include/clang/Basic/FileManager.h
176 ↗(On Diff #83069)

Why is this inside the FileManager? It isn't used by the FileManager.

308 ↗(On Diff #83069)

Can we call this something like "ModuleLoadContext" or maybe "ModuleCompilationContext"? The word thread is misleading, since the threads are not run concurrently, and are only an implementation detail of our crash recovery.

Also, can you explain why it is only necessary to keep information about the the current stack of contexts? In a situation like

A imports B imports C
A imports D imports C

We would no longer have information about C being validated, right? What happens if there's a mismatch there? Can this never happen?

I forgot to upload the testing case, I will add it now.

Manman

include/clang/Basic/FileManager.h
176 ↗(On Diff #83069)

I initially had another patch that puts PCMCache object at the top level (i.e whenever we create a FileManager, we create a PCMCache object). The initial patch is much bigger than this patch. PCMCache is cacheing the content of a PCM File, and is shared across threads that we spawned to build modules implicitly, just like FileManager. I had some discussions with Bruno and Adrian, we felt FileManager is a better place. But if you have other suggestions, please let me know!

308 ↗(On Diff #83069)

Sure, we can call it ModuleCompilationContext because we are spawning threads to build modules implicitly.

The issue we are trying to detect is use-after-free, i.e we don't want a thread to hold on to an invalid FileEntry. We error out before trying to delete the FileEntry that is live in another thread.

In a situation like
A imports B imports C
A imports D imports C

A will start a thread to build B, and B will start a thread to build C, C will be validated in B's compilation thread and A's compilation thread. Later on, if D tries to invalidate C, D knows that an ancestor thread A has validated C, so the compiler will error out. And it is okay for A to invalidate C in A's compilation thread, because we can clean up the references to C's FileEntry.

manmanren updated this revision to Diff 83273.Jan 5 2017, 10:47 AM
manmanren edited edge metadata.

Add testing case.

manmanren added inline comments.Jan 5 2017, 10:54 AM
include/clang/Basic/DiagnosticSerializationKinds.td
131 ↗(On Diff #83069)

Thanks Adrian. Your wording sounds better.

135 ↗(On Diff #83069)

I am not sure if we will emit this error diagnostic in some case, that is why I don't have a testing case. Maybe it is better to throw a hard error instead of a diagnostic that user can't understand?

benlangmuir added inline comments.Jan 10 2017, 4:11 PM
include/clang/Basic/FileManager.h
176 ↗(On Diff #83069)

I see. It's not ideal, but I see why it would be convenient to use the FileManager this way, since you want to share it in all the same places the FileManager is shared. Since I don't have any better answer, I guess I'm okay with this...

308 ↗(On Diff #83069)

Okay

manmanren updated this revision to Diff 84137.Jan 12 2017, 9:58 AM

Addressing review comments.

manmanren marked 2 inline comments as done.Jan 12 2017, 9:59 AM

Ping :]
Hoping to wrap this up this week.
Cheers,
Manman

benlangmuir accepted this revision.Jan 20 2017, 9:49 AM

Sorry for the delay, I though you were still iterating with @aprantl for some reason. LGTM

This revision is now accepted and ready to land.Jan 20 2017, 9:49 AM
bruno added inline comments.Jan 24 2017, 5:56 PM
lib/Serialization/ASTReader.cpp
3692 ↗(On Diff #84137)

This should honor ARR_OutOfDate, so that the module can be rebuilt in case a user module mismatch via a different diagnostics (lib/Serialization/ASTReader.cpp:4076) and returns OutOfDate instead of Success:

if ((ClientLoadCapabilities & ARR_OutOfDate) == 0)
  for (auto N : ValidationConflicts)
    Diag(diag::err_module_ancestor_conflict) << N;
dexonsmith commandeered this revision.Feb 6 2017, 10:37 AM
dexonsmith edited reviewers, added: manmanren; removed: dexonsmith.
dexonsmith edited edge metadata.

Taking this over to rebase it.

dexonsmith updated this revision to Diff 87257.Feb 6 2017, 10:47 AM

Rebased on top of the current https://reviews.llvm.org/D27689. This also has some substantive changes:

  • Renamed the data structure PCMCache to MemoryBufferCache, and split it out into its own files with tests.
  • Simplified logic around whether a MemoryBuffer has already been used. Tracking of what has already been validated now takes O(1) memory instead of O(n^2).
  • Moved FileManager::BufferMgr (the instance of PCMCache) to CompilerInstance::PCMCache and ModuleManager::PCMCache (instances of IntrusiveRefCntPtr<MemoryBufferCache>). This took some work to thread through, but since the PCMCache isn't related to the FileManager, this seemed like a cleaner result.
  • Added some testcases for bugs we've found in the meantime (thanks to Adrian Prantl). Primarily, we no longer error or cause a use-after-free if a MemoryBuffer was validated by an ancestor.
dexonsmith requested review of this revision.Feb 6 2017, 11:10 AM
dexonsmith edited edge metadata.

This has changed enough it needs another review. Richard or Ben, could you take a(nother) look?

dexonsmith updated this revision to Diff 91799.Mar 14 2017, 5:16 PM

Rebased again, and cleaned up tests, and a ping... now that dependencies are resolved.

Richard and/or Ben, can you take a look?

! In D28299#701194, @dexonsmith wrote:
Rebased again, and cleaned up tests, and a ping... now that dependencies are resolved.

Richard and/or Ben, can you take a look?

(Or anyone else that feels comfortable confirming that the ownership semantics of MemoryBufferCache are correct.)

Drive-by comment.

clang/lib/Basic/MemoryBufferCache.cpp
18

we should be able to replace the std::make_pair() with {}.

benlangmuir accepted this revision.Mar 17 2017, 9:58 AM
This revision is now accepted and ready to land.Mar 17 2017, 9:58 AM
dexonsmith closed this revision.Mar 17 2017, 4:08 PM
dexonsmith marked an inline comment as done.

Thanks for the reviews! Committed in r298165.