This is an archive of the discontinued LLVM Phabricator instance.

[clangd] SIGSEGV at clangd: DiagnosticConsumer Is Used After Free
ClosedPublic

Authored by ivanmurashko on Sep 1 2023, 8:48 AM.

Details

Summary

This is a follow-up patch for D148088. The dynamic symbol index (FileIndex::updatePreamble) may run in a separate thread, and the DiagnosticConsumer that is set up in buildPreamble might go out of scope before it is used. This could result in a SIGSEGV when attempting to call any method of the DiagnosticConsumer class.

The function buildPreamble sets up the DiagnosticConsumer as follows:

... buildPreamble(...) {
...
  StoreDiags PreambleDiagnostics;
  ...
  llvm::IntrusiveRefCntPtr<DiagnosticsEngine> PreambleDiagsEngine =
    CompilerInstance::createDiagnostics(&CI.getDiagnosticOpts(),
                                        &PreambleDiagnostics,
                                        /*ShouldOwnClient=*/false);
  ...
  // The call might use the diagnostic consumer in a separate thread
  PreambleCallback(...)
  ...
}

PreambleDiagnostics might be out of scope for buildPreamble function when we call it inside PreambleCallback in a separate thread.

The Fix
The fix involves replacing the client (DiagnosticConsumer) with an IgnoringDiagConsumer instance, which will print messages to the clangd log.

Alternatively, we can replace PreambleDiagnostics with an object that is owned by DiagnosticsEngine.

Note
There is no corresponding LIT/GTest for this issue, since there is a specific race condition that is difficult to reproduce within a test framework.

Test Plan:

ninja check-clangd

Diff Detail

Event Timeline

ivanmurashko created this revision.Sep 1 2023, 8:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2023, 8:48 AM
Herald added a subscriber: arphaman. · View Herald Transcript
ivanmurashko requested review of this revision.Sep 1 2023, 8:48 AM
ivanmurashko added inline comments.Sep 1 2023, 8:54 AM
clang-tools-extra/clangd/Preamble.cpp
610–619

Alternative fix

sammccall accepted this revision.Sep 1 2023, 8:56 AM

Nice catch!

clang-tools-extra/clangd/Preamble.cpp
709

I think this should go up next to PreambleDiagsEngine.reset() etc, as it's basically the same thing: we're trying to avoid any race between the async work done by the callback and the cleanup at the end of the function.

Also I think it's slightly clearer for us to consistently be taking the diagnostics from PreambleDiagnostics *after* it's detached from the compiler.

This revision is now accepted and ready to land.Sep 1 2023, 8:56 AM
sammccall added inline comments.Sep 1 2023, 8:57 AM
clang-tools-extra/clangd/Preamble.cpp
610–619

I prefer the version in the patch: we keep PreambleDiagnostics auto-scoped and can more easily reason about its lifetime.

ivanmurashko marked an inline comment as done.

@sammccall's comment was addressed

clang-tools-extra/clangd/Preamble.cpp
709

I applied the change

ivanmurashko marked an inline comment as done.Sep 2 2023, 12:29 PM
ivanmurashko marked an inline comment as not done.

thanks, the fix LGTM as well.

but i wonder how this surfaces, to make sure we're taking necessary precautions in the future. we definitely have a dangling reference, which isn't great. but it's surprising that we access diags consumer during indexing.
I assume it's about the modules setup you're running clangd in. Do you have any stack traces that shows the execution path? my assumption is, this triggers when clangd ends up deserializing some symbols from a module. If these end up being important diagnostics, we might want to figure out how to emit diagnostics from these stages as well.

This comment was removed by kadircet.
kadircet accepted this revision.Sep 4 2023, 12:30 AM

thanks, the fix LGTM as well.

but i wonder how this surfaces, to make sure we're taking necessary precautions in the future. we definitely have a dangling reference, which isn't great. but it's surprising that we access diags consumer during indexing.
I assume it's about the modules setup you're running clangd in. Do you have any stack traces that shows the execution path? my assumption is, this triggers when clangd ends up deserializing some symbols from a module. If these end up being important diagnostics, we might want to figure out how to emit diagnostics from these stages as well.

Yes, you are right. The diags consumer is triggered when it tries to read an implicit module that has some incompatibilities with the preamble headers.

The typical stack trace is below (that is LLVM-12 specific)

clang::DiagnosticIDs::ProcessDiag(clang::DiagnosticsEngine&)
clang::DiagnosticsEngine::EmitCurrentDiagnostic(bool)
clang::DiagnosticBuilder::~DiagnosticBuilder()
clang::ASTReader::diagnoseOdrViolations()
clang::ASTReader::FinishedDeserializing()
clang::DeclContext::LoadLexicalDeclsFromExternalStorage()
clang::DeclContext::decls_begin()
clang::index::IndexingContext::indexDeclContext(clang::DeclContext const*)
clang::declvisitor::Base<llvm::make_const_ptr, (anonymous namespace)::IndexingDeclVisitor, bool>::Visit(clang::Decl const*)
clang::index::IndexingContext::indexDecl(clang::Decl const*)
clang::index::IndexingContext::indexDeclContext(clang::DeclContext const*)
clang::declvisitor::Base<llvm::make_const_ptr, (anonymous namespace)::IndexingDeclVisitor, bool>::Visit(clang::Decl const*)
clang::index::IndexingContext::indexDecl(clang::Decl const*)
clang::index::IndexingContext::indexDeclContext(clang::DeclContext const*)
clang::declvisitor::Base<llvm::make_const_ptr, (anonymous namespace)::IndexingDeclVisitor, bool>::Visit(clang::Decl const*)
clang::index::IndexingContext::indexDecl(clang::Decl const*)
clang::index::indexTopLevelDecls(clang::ASTContext&, clang::Preprocessor&, llvm::ArrayRef<clang::Decl const*>, clang::index::IndexDataConsumer&, clang::index::IndexingOptions)
clang::clangd::(anonymous namespace)::indexSymbols(clang::ASTContext&, std::shared_ptr<clang::Preprocessor>, llvm::ArrayRef<clang::Decl*>, clang::clangd::MainFileMacros const*, clang::clangd::CanonicalIncludes const&, bool, llvm::StringRef, bool)
clang::clangd::indexHeaderSymbols(llvm::StringRef, clang::ASTContext&, std::shared_ptr<clang::Preprocessor>, clang::clangd::CanonicalIncludes const&)
clang::clangd::FileIndex::updatePreamble(llvm::StringRef, llvm::StringRef, clang::ASTContext&, std::shared_ptr<clang::Preprocessor>, clang::clangd::CanonicalIncludes const&)
void
void
threadFuncAsync(void*)
start_thread