This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Introduce ASTHooks to FeatureModules
ClosedPublic

Authored by kadircet on Mar 12 2021, 5:56 AM.

Details

Summary

These can be invoked at different stages while building an AST to let
FeatureModules implement features on top of it. The patch also
introduces a sawDiagnostic hook, which can mutate the final clangd::Diag
while reading a clang::Diagnostic.

Diff Detail

Event Timeline

kadircet created this revision. Mar 12 2021, 5:56 AM
kadircet requested review of this revision. Mar 12 2021, 5:56 AM
kadircet updated this revision to Diff 330258. Mar 12 2021, 8:49 AM
  • Add tests

I'm getting a little nervous about the amount of stuff we're packing into modules without in-tree examples.
I should split out some of the "standard" features into modules as that's possible already.

My model for modules using this diagnostic stuff (apart from the build-system stuff which sadly can't be meaningfully upstreamed) are IncludeFixer, ClangTidy, and IWYU - worth thinking about how we'd build those on top of this model. (Doesn't mean we need to add a hook to emit diagnostics, but it probably means we should know where it would go)

clang-tools-extra/clangd/FeatureModule.h
104

naming: giving this a plural name and having a collection of them is a little confusing (is Hooks the hooks from one module, or from all of them?). What about ASTListener?
(Listener suggests passive but listeners that "participate" is a common enough idiom I think)

112

This comment hints at what I *thought* was the idea: astHooks() is called each time we parse a file, the returned object has methods called on it while the file is being parsed, and is then destroyed.

But the code suggests we call once globally and it has effectively the same lifetime as the module.
This seems much less useful, e.g. if we want to observe several diagnostics, examine the preamble, and emit new diagnostics, then we have to plumb around some notion of "AST identity" rather than just tying it to the identity of the ParseASTHooks object itself.

(Lots of natural extensions here like providing ParseInputs to astHooks(), but YAGNI for now)
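
A minimal sketch of the per-parse lifetime described above, in simplified C++. Only the astHooks() factory name comes from this discussion. RawDiagnostic, Diag, ASTHooks as written here, CountingModule and parseOnce are hypothetical stand-ins (for clang::Diagnostic, clangd::Diag and the real build path) and are not the code in this patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for clang::Diagnostic and clangd::Diag.
struct RawDiagnostic { std::string Message; };
struct Diag {
  std::string Message;
  std::vector<std::string> Notes;
};

// Per-parse extension point: one instance observes exactly one build.
class ASTHooks {
public:
  virtual ~ASTHooks() = default;
  // May mutate the outgoing Diag while the raw diagnostic is being read.
  virtual void sawDiagnostic(const RawDiagnostic &, Diag &) {}
};

class FeatureModule {
public:
  virtual ~FeatureModule() = default;
  // Called once per parse; returning nullptr means "not interested".
  virtual std::unique_ptr<ASTHooks> astHooks() { return nullptr; }
};

// Example module whose per-parse state lives in the hooks object itself.
class CountingModule : public FeatureModule {
  class Hooks : public ASTHooks {
    unsigned Seen = 0;
    void sawDiagnostic(const RawDiagnostic &, Diag &D) override {
      D.Notes.push_back("diagnostic #" + std::to_string(++Seen) +
                        " in this parse");
    }
  };

public:
  std::unique_ptr<ASTHooks> astHooks() override {
    return std::make_unique<Hooks>();
  }
};

// One simulated parse: create fresh hooks, feed them diagnostics, and let
// them be destroyed when the function returns.
std::vector<Diag> parseOnce(const std::vector<FeatureModule *> &Modules,
                            const std::vector<RawDiagnostic> &Raw) {
  std::vector<std::unique_ptr<ASTHooks>> Hooks;
  for (FeatureModule *M : Modules) {
    assert(M && "modules must be non-null");
    if (auto H = M->astHooks())
      Hooks.push_back(std::move(H));
  }
  std::vector<Diag> Out;
  for (const RawDiagnostic &R : Raw) {
    Diag D;
    D.Message = R.Message;
    for (auto &H : Hooks)
      H->sawDiagnostic(R, D);
    Out.push_back(std::move(D));
  }
  return Out;
}
```

Because the counter lives in the hooks object, a second call to parseOnce starts counting from 1 again, and the module itself never has to track which AST a diagnostic belongs to.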

Herald added a project: Restricted Project. Mar 15 2021, 9:57 AM
kadircet marked an inline comment as done. Mar 23 2021, 4:29 AM

My model for modules using this diagnostic stuff (apart from the build-system stuff which sadly can't be meaningfully upstreamed) are IncludeFixer, ClangTidy, and IWYU - worth thinking about how we'd build those on top of this model. (Doesn't mean we need to add a hook to emit diagnostics, but it probably means we should know where it would go)

Agreed. I believe they all would need extra end points to ASTHooks though.

ClangTidy:

  • needs extra hooks to register PP callbacks, and take in a diagnostics engine
  • needs a new endpoint to traverse ast and *emit* diags
  • CTContext needs to stay alive until parsing is done, so we can:
    • make ASTHooks own it and instantiate a new one on every parse (i think the cleanest and most explicit)
    • make the module own them and control the lifetime with entry/exit calls on the asthooks. (there's more burden on the modules now, and they'll need extra synchronisation on enter/exit calls)
    • return some other object on parsing entry hook that'll be kept alive by the caller (needs design around semantics of that object).

IncludeFixer:

  • needs a new hook to act as an external sema source, so that it can fix unresolved names.
  • similar to CTContext issue, it references per-tu state like HeaderSearchInfo, so will need to incorporate these into entry hook, and somehow disambiguate hooks for different TUs (all the options proposed for tidy).

IWYU:

  • I'd expect this to make use of ~same API as IncludeFixer.

PreamblePatching:

  • can mutate compiler instance via a hook
  • drop diagnostics from ParsedAST on exit (requires some mutation on parsedast though)

clang-tools-extra/clangd/FeatureModule.h
112

This comment hints at what I *thought* was the idea: astHooks() is called each time we parse a file, the returned object has methods called on it while the file is being parsed, and is then destroyed.

This was the intent actually (to enable modularization of other features like clangtidy, includefixer, etc. as mentioned above), but looks like i got confused while integrating this into clangdserver :/

While trying to resolve my confusion i actually noticed that we cannot uphold the contract of "hooks being called synchronously", because we actually have both a preamblethread and astworker that can invoke hooks (embarrassing of me to forget that 😓).

So we can:

  • Give up on that promise and make life slightly complicated for module implementers
  • Don't invoke hooks from preamblethread (at the cost of limiting functionality: we can still have downstream users that act on PPCallbacks, but no direct integration with compiler, and that's where layering violations are generated :/)
  • Handle the synchronization ourselves, which only complicates TUScheduler more, rather than all the module implementers.
  • Propagate FeatureModuleSet into TUScheduler and create 2 sets of hooks, one on each thread :)

I am leaning towards 4, but unsure (mostly hesitant about dependency schema, as featuremodules also depend on TUScheduler..). WDYT?

kadircet updated this revision to Diff 332613. Mar 23 2021, 4:29 AM
  • Rename ParsedASTHooks to Listeners.
  • Generate list of hooks on each parse.

My model for modules using this diagnostic stuff (apart from the build-system stuff which sadly can't be meaningfully upstreamed) are IncludeFixer, ClangTidy, and IWYU - worth thinking about how we'd build those on top of this model. (Doesn't mean we need to add a hook to emit diagnostics, but it probably means we should know where it would go)

Agreed. I believe they all would need extra end points to ASTHooks though.

Yup. Let's not bite off more for now.

  • make ASTHooks own it and instantiate a new one on every parse (i think the cleanest and most explicit)

Agreed. (And this is the only one that overlaps this patch a lot)

clang-tools-extra/clangd/FeatureModule.h
112

we actually have both a preamblethread and astworker that can invoke hooks (embarrassing of me to forget that 😓)

Ha! And yes, in our motivating case of fixing build system rules, the diagnostics are mostly going to be in the preamble.

The options that seem most tempting to me are:

  • don't attempt to create/run astHooks while building preambles, but *do* feed the preamble's Diags into the main-file's AST hooks every time it's used. (you won't have a clang::Diagnostic, so that param would have to be gone/optional, and we'd scrape the messages instead of extracting args). This is kind of thematic, remember how we replay PPCallbacks for clang-tidy :-). This is the smallest tweak to your #2 that actually works for us, I think.
  • create one astHooks for the preamble, and another for the AST build, and try to make the interface suitable for both. This is cute but there may be too much tension between the two cases. (Is this what you mean by #4?)
  • or have separate ASTHooks & PreambleHooks interfaces and support all this crap for both. (Or is *this* what you mean by #4?) Hybrid is also possible, give the interfaces an inheritance relationship, or have one interface but pass boolean parameters to indicate which version we're doing, or...
  • preambles and ASTs are a tree, so... give modules a PreambleHooks factory, and the factory function for ASTHooks is PreambleHooks::astHooks(). Holy overengineering batman...

These are roughly in order of complexity so we should probably start toward the top of the list somewhere. Up to you.

(I don't like the #1 or #3 in your list above much at all.)

kadircet added inline comments. Mar 29 2021, 2:36 PM
clang-tools-extra/clangd/FeatureModule.h
112

create one astHooks for the preamble, and another for the AST build, and try to make the interface suitable for both. This is cute but there may be too much tension between the two cases. (Is this what you mean by #4?)

yes this is what i meant by #4.

These are roughly in order of complexity so we should probably start toward the top of the list somewhere. Up to you.

I'd lean towards the second option (creating separate ASTHooks with the same interface for preamble and astworker). As discussed before, the first one (i.e. scraping the diag message) is a limited solution that's likely to back us into a corner in the future. Let me know if you have any concerns around this approach.

kadircet updated this revision to Diff 334079. Mar 30 2021, 2:07 AM
  • Have 2 separate hooks for preamble and mainfile ASTs, as they are produced async.
sammccall added inline comments. Apr 8 2021, 5:36 AM
clang-tools-extra/clangd/ClangdServer.cpp
237

Now we're creating two ASTListeners for each version of the file seen, but most won't be parsed twice (and some won't even be parsed once).

This seems like it's going to be a weird wart for e.g. hooks that expect to do something interesting when a file finishes, or those that allocate something heavyweight.

I'd suggest sinking the ModuleSet into ParsedAST::build etc, or at least moving the astHooks() calls into TUScheduler. (And documenting that astHooks() should be threadsafe)

Otherwise, need to document that sometimes hooks are created and then never used.

clang-tools-extra/clangd/Compiler.h
60

I think having these in ParseInputs is conceptually confusing and shared_ptr is risky.
For example we copy inputs into ASTWorker::FileInputs, which extends the lifetime of the listeners in a way that seems undesirable.

I think these listeners should be clearly either:

  • bound to the lifetime of the ParsedAST (and therefore owned by it)
  • bound to the *creation* of the ParsedAST (and therefore strictly scoped within ParsedAST::build)

You could consider the ModuleSet to be part of ParseInputs, though...

clang-tools-extra/clangd/Diagnostics.h
150

ArrayRef<shared_ptr<ASTListener>> seems like we're leaking a lot. And even mentioning AST hooks seems dubious layering-wise.
Can we use a std::function like LevelAdjuster?
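
A minimal sketch of that suggestion, assuming a simplified diagnostics store: it only knows about a std::function callback and has no dependency on any listener or module type. DiagStore, setDiagCallback, handle and take are hypothetical names for illustration, and RawDiagnostic and Diag stand in for clang::Diagnostic and clangd::Diag.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for clang::Diagnostic and clangd::Diag.
struct RawDiagnostic { std::string Message; };
struct Diag { std::string Message; };

// Simplified diagnostics store. It sees only a callback, so the code that
// owns the listeners decides how to fan the call out to them.
class DiagStore {
public:
  using DiagCallback = std::function<void(const RawDiagnostic &, Diag &)>;

  void setDiagCallback(DiagCallback CB) {
    assert(CB && "callback must be callable");
    Callback = std::move(CB);
  }

  // Converts a raw diagnostic and gives the callback a chance to mutate the
  // result before it is stored.
  void handle(const RawDiagnostic &Raw) {
    Diag D;
    D.Message = Raw.Message;
    if (Callback)
      Callback(Raw, D);
    Output.push_back(std::move(D));
  }

  // Hands the collected diagnostics to the caller.
  std::vector<Diag> take() { return std::move(Output); }

private:
  DiagCallback Callback;
  std::vector<Diag> Output;
};
```

The caller that holds the listeners would install a lambda that loops over them, which keeps the listener types out of the diagnostics layer.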

clang-tools-extra/clangd/FeatureModule.h
101

nit: this describes the interaction more but not what it's for.
Maybe

/// Extension point that allows modules to observe and modify an AST build.
/// One instance is created each time clangd produces a ParsedAST or PrecompiledPreamble.
/// For a given instance, lifecycle methods are always called on a single thread.
113

the "hooks" name is still used here and throughout

kadircet updated this revision to Diff 337101. Apr 13 2021, 4:14 AM
kadircet marked 7 inline comments as done.
  • Pass FeatureModuleSet rather than astListeners in ParseInputs.
  • Create listeners only when building ASTs.
  • Use a callback function in StoreDiags to notify about diagnostics.
sammccall accepted this revision. Apr 13 2021, 6:23 AM
sammccall added inline comments.
clang-tools-extra/clangd/FeatureModule.h
106

comment: listeners are destroyed once the AST is built.

181

this seems like a slightly weird thing to have in FeatureModule.h, I'd be tempted to inline it in the callsites instead, but up to you.

This revision is now accepted and ready to land. Apr 13 2021, 6:23 AM
kadircet updated this revision to Diff 337168. Apr 13 2021, 8:39 AM
kadircet marked 2 inline comments as done.
  • Inline helper to call-sites
  • Add comments about destruction of ast listeners.
This revision was landed with ongoing or failed builds. Apr 13 2021, 8:50 AM
This revision was automatically updated to reflect the committed changes.