This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Replace ASTUnit with manual AST management.
ClosedPublic

Authored by ilya-biryukov on Jul 14 2017, 2:38 AM.

Details

Summary

This refactoring does not aim to introduce any significant changes to
the behaviour of clangd to keep the change as simple as possible.

Diff Detail

Repository
rL LLVM

Event Timeline

ilya-biryukov created this revision.Jul 14 2017, 2:38 AM
ilya-biryukov added a project: Restricted Project.Jul 14 2017, 2:38 AM

Depends on change to cfe (https://reviews.llvm.org/D35405) to compile properly.

  • Replaced TODO with FIXME in a comment.

Could you explain what the goal of this change is? It would help understand how it will impact the indexing work I am currently doing.

The idea is to allows us changing the way we manage/rebuild PCHs and ASTs.
ASTUnit is used for many things and has a fairly complicated and verbose interface and does a lot of mutations all other the place inside it, so making changes to it is not particularly easy.
On the other hand, it doesn't add a lot of value for clangd specifically, since we use a very simple subset of it.

The next step I planned was to allow code completion to be run in parallel with recomputing preamble/computing diagnostics. (And in general, allow multiple TUs to be processed in parallel).
That would probably involve changes to the interface, so we should definitely sync with you on the work you've been doing.
And we were also planning to look into indexing at a later point, so discussing the design there might be interesting as well.

The idea is to allows us changing the way we manage/rebuild PCHs and ASTs.
ASTUnit is used for many things and has a fairly complicated and verbose interface and does a lot of mutations all other the place inside it, so making changes to it is not particularly easy.
On the other hand, it doesn't add a lot of value for clangd specifically, since we use a very simple subset of it.

The next step I planned was to allow code completion to be run in parallel with recomputing preamble/computing diagnostics. (And in general, allow multiple TUs to be processed in parallel).
That would probably involve changes to the interface, so we should definitely sync with you on the work you've been doing.
And we were also planning to look into indexing at a later point, so discussing the design there might be interesting as well.

Thanks a lot for the additional information.

For synching with what I am doing. I am "only" looking right now at the modeling of the index and its on-disk storage, not really on the speeding up of the parsing of the TUs (i.e. the input of the index). I use index::IndexDataConsumer to feed the index and I see that in your change it is still pretty much used the same way as before so there is no problem. I will post on the indexing thread from before with a more detailed proposal from before once it is a bit further along.

For synching with what I am doing. I am "only" looking right now at the modeling of the index and its on-disk storage, not really on the speeding up of the parsing of the TUs (i.e. the input of the index). I use index::IndexDataConsumer to feed the index and I see that in your change it is still pretty much used the same way as before so there is no problem. I will post on the indexing thread from before with a more detailed proposal from before once it is a bit further along.

Great that this change is not interfering. Eager to see your proposal for the indexing.

krasimir added inline comments.Jul 19 2017, 6:12 AM
clangd/ClangdUnit.cpp
167 ↗(On Diff #106632)

Why DisableFree?

251 ↗(On Diff #106632)

What's the purpose of this local scope here?
Wouldn't &CommandLineDiagsConsumer point to garbage after the end of this local scope?

262 ↗(On Diff #106632)

We can not compute Bounds, if (Preamble).

268 ↗(On Diff #106632)

Here PreambleDiagsEngine holds a pointer to the local variable PreambleDiagnosticsConsumer.
Next, BuiltPreamble holds a reference to PreambleDiagsEngine.
Next, Preamble is created by moving BuiltPreamble.
Doesn't then Preamble hold a transitive reference to junk after this local scope?

404 ↗(On Diff #106632)

what's the purpose of this local scope?

691 ↗(On Diff #106632)

Why do we need to ensure that Decls are deserialized?

clangd/ClangdUnit.h
77 ↗(On Diff #106632)

Why is this a separate class and not just part of ClangdUnit?

Added a few comments.

clangd/ClangdUnit.cpp
167 ↗(On Diff #106632)

We rely on CompilerInstance to free the resources. It won't do that if DisableFree is set to true. Added a comment about that.

251 ↗(On Diff #106632)

This DiagnosticsEngine is not stored in CompilerInvocation created here, it is used for reporting errors during command-line parsing.
Since we don't show those errors to the user, we discard them.

Local scope is there to avoid referencing CommandLineDiagsEngine later (as it will discard diagnostics, and we actually want to have them).

262 ↗(On Diff #106632)

When Preamble is set, we need Bounds for Preamble.CanReuse.
When Preamble is not set, we need Bounds for PrecompiledPreamble::Build.

So we need them in both cases.

268 ↗(On Diff #106632)

I don't think BuiltPreamble holds a reference to PreambleDiagsEngine. Here is a list of fields of PrecompiledPreamble class:

//...
TempPCHFile PCHFile;
llvm::StringMap<PreambleFileHash> FilesInPreamble;
std::vector<char> PreambleBytes;
bool PreambleEndsAtStartOfLine;
404 ↗(On Diff #106632)

The DiagnosticsEngine used here uses default DiagnosticOptions.
A one that's properly configured after reading command-line arguments is created below, inside a prepareCompilerInstance call.
To avoid using DiagnosticsEngine configured by default, it is put into a local scope.

691 ↗(On Diff #106632)

These Decls are used in findDefinitions (specifically, they are passed to indexTopLevelDecls). I've opted for mimicing the ASTUnit behaviour here.

It is possible that visiting only local decls (i.e. not from Preamble) will not break anything. But that requires thorough testing and digging into how PCHs works (one question is, if it is possible to have decls from the same file as part of the PCH). I actually want to investigate if we actually need all that deserializing logic here, but would prefer to do that with a separate change to keep semantic changes related to this commit as small as possible.

PS. Alternatively, we could just visit all decls in ASTContext whenever we need them for findDefinitions, but a comment in ASTUnit says it's slower.

clangd/ClangdUnit.h
77 ↗(On Diff #106632)

For managing the lifetime of an AST. Specifically:

  • there is a destructor here that calls EndSourceFile to free AST-related resources,
  • all getters return references to objects that either have pointers to resources, managed by ParsedAST (getASTContext, getPreprocessor, etc) or semantically tied to this specific parsing result(getDiagnostics, PendingTopLevelDecls).

ClangdUnit is managing two things at the moment: a preamble and an AST. Since the lifetime of those two is different, it is arguably a good idea to separate them.

krasimir accepted this revision.Jul 21 2017, 5:12 AM

Looks good!

This revision is now accepted and ready to land.Jul 21 2017, 5:12 AM
This revision was automatically updated to reflect the committed changes.
petarj added a subscriber: petarj.Jul 26 2017, 8:22 AM

Can you check if this change introduced clang-x86-windows-msvc2015 buildbot failure?

Can you check if this change introduced clang-x86-windows-msvc2015 buildbot failure?

It did. Was fixed in r308959.