This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Handle additional includes while parsing ASTs
ClosedPublic

Authored by kadircet on Apr 7 2020, 5:33 AM.

Details

Summary

Depends on D77392.

Enables building ASTs with stale preambles by handling additional preamble
includes. Sets the correct location information for those imaginary includes so
that features like gotodef/documentlink keeps functioning propoerly.

Diff Detail

Event Timeline

kadircet created this revision.Apr 7 2020, 5:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2020, 5:33 AM
kadircet planned changes to this revision.Apr 7 2020, 1:01 PM

this doesn't handle source locations correctly in presence of deleted headers.

kadircet updated this revision to Diff 260228.Apr 27 2020, 12:19 AM
  • Use PreamblePatch
  • Add more tests
  • Handle deleted includes

this is ready for review now, to explain the flow:

While creating a patch we also record remaining includes from Baseline left in Modified, as the new ones will be put into the patch already.
That way ParsedAST can keep track of latest state of the preamble + main file includes, with the patched includes ending up in the latter.

It doesn't modify include depths, as it is only used by code completion. If need be we can prune the tree to get rid of deleted includes.

kadircet marked an inline comment as done.Apr 27 2020, 3:28 AM
kadircet added inline comments.
clang-tools-extra/clangd/Preamble.h
117

i am not happy about this interface. I was thinking about a std::vector<Inclusion> remainingBaselineIncludes() const but didn't go for it to keep the concept of patching.

I suppose the former is more intuitive though, WDYT?

This is excellent, a pretty nice model in the end!
Good job puzzling it out, when we discussed the idea in the past I imagined this part would be a pile of hacks.

Thanks for the offline help understanding this, too...

clang-tools-extra/clangd/ParsedAST.cpp
273

As discussed offline, might be clearer not to have the patch exist unless the preamble does.

clang-tools-extra/clangd/ParsedAST.h
51 ↗(On Diff #260228)

ah, this is going to conflict with https://github.com/llvm/llvm-project/commit/6880c4dfa3981ec26d44b5f4844b641342a4e3e8
Sorry about that, but that change also ParseInputs available here so I think you can just drop your changes.

clang-tools-extra/clangd/Preamble.cpp
279

BTW do we want to check isPreambleCompatible and bail out early if it is, or is that the caller's job?

337

This explicitly looks at the existing includes based on spelling, and not whether the include was resolved during scanning or not.

One implication is that if you insert an #include that was already transitively included (which IIUC will hit the preamble VFS cache and thus be resolved) then we're not going to take advantage of this to record its resolution now.

Instead we're going to parse it along with the mainfile, and... well, I *hope* everything works out:

  • do we pay for the stat, or is the preamble's stat cache still in effect?
  • if the file is include-guarded, we're not going to read or parse it I guess?
  • if it's not include-guarded but #imported, then again we won't re-read or re-parse it because we use the same directive?
  • if the file isn't include-guarded, I guess we must read it (because preamble doesn't contain the actual buffer) and then parse it, but such is life.
344–345

nice to have a comment similar to the one on line 333 for this case

// Include is new in the modified preamble.
// Inject it into the patch and use #line to set the presumed location to where it is spelled
clang-tools-extra/clangd/Preamble.h
114

This comment explains what but not why, and the why is pretty non-obvious here.
Assuming we switch to returning a vector, I'd call it preambleIncludes() and say something like:

Returns #include directives from the \c Modified preamble that were resolved using the \c Baseline preamble.
This covers the new locations of inclusions that were moved around, but not inclusions of new files.
Those will be recorded when parsing the main file: the includes in the injected section will be resolved back to their spelled positions in the main file using the presumed-location mechanism.

("injected section" isn't a term we've defined anywhere, it'd be good to give this idea some name in the class comment as it's the main mechanism behind preamble patching)

117

Looking at the callsite, I think I like returning a vector better. buildAST is copyng a data structure specifically to be patched, and the patch is just overwriting it. The interface looks good but doesn't really reflect what's going on.

To make the vector thing work cleanly, I think we should have it always overwrite. This implies it needs to have the preamble's includes, so the default constructor needs to go away. I'd suggest adding a static factory PreamblePatch::unmodified(const PreambleData&) and hiding all constructors.
This means we can't have a PreamblePatch instance with no preamble, which means dealing with Optional in places but it also seems clearer.

kadircet marked 8 inline comments as done.May 4 2020, 12:16 AM
kadircet added inline comments.
clang-tools-extra/clangd/Preamble.cpp
279

I am planning to leave it to the caller. Eventually this PreamblePatch should be generated by TUScheduler, in which we can check for staleness and issue either an unmodified patch or create a patch and use it until new preamble is ready or patch is invalidated.

337

This explicitly looks at the existing includes based on spelling, and not whether the include was resolved during scanning or not.

Right. Scanning won't resolve anything as we pass an empty FS to preprocessor.

Instead we're going to parse it along with the mainfile, and... well, I *hope* everything works out:
do we pay for the stat, or is the preamble's stat cache still in effect?

It depends on the user of PreamblePatch, while building an AST we build preprocessor with cached fs coming from preamble.

if the file is include-guarded, we're not going to read or parse it I guess?

Yes that's the case. We'll only see the include directive.

if it's not include-guarded but #imported, then again we won't re-read or re-parse it because we use the same directive?

Yes again.

if the file isn't include-guarded, I guess we must read it (because preamble doesn't contain the actual buffer) and then parse it, but such is life.

Unfortunately yes again, but as you said i am not sure if that's a case we should try and recover from as i can only see two use cases:

  • You include the header in a different preprocessor state and intentionally want it to parse again, hopefully we'll handle this after macro patching.
  • You deleted the include from a transitively included file and inserted it back into this file again. Well, now the hell breaks loose until we build the preamble again. In the worst case AST will turn into a garbage and we won't be able to serve anything but code completions, which is the state of the world without preamble patching, we are just consuming more cpu now.
344–345

handled in D78743

kadircet updated this revision to Diff 261745.May 4 2020, 12:16 AM
kadircet marked an inline comment as done.
  • Address comments
sammccall accepted this revision.May 19 2020, 3:53 AM

Sorry for the delay. I think this is good though I have a nagging feeling I don't understand all the implications yet :-)

This revision is now accepted and ready to land.May 19 2020, 3:53 AM
kadircet updated this revision to Diff 266215.May 26 2020, 8:07 AM
  • Fix windows build bots in the face of #import directives
This revision was automatically updated to reflect the committed changes.

Hi @kadircet!
I am investigating a failure of PatchesAdditionalIncludes test that we got internally. It's a failed assert in ReplayPreamble::replay.
Our clangd source code is for all practical purposes identical to upstream version but not so with clang source. Specifically what we do is that in CompilerInstance::createPreprocessor we always add one particular callback.
This means that when in the test we are calling buildPreamble for TestTU TU with empty buffer we never hit the early return in ReplayPreamble::attach() (ParsedAST.cpp:124) like upstream version does and proceed to create a new ReplayPreamble object with PreambleBounds of size() == 0 which leads to ReplayPreamble::MainFileTokens to be empty and later we hit failed assert in ReplayPreamble::replay about assert(HashTok != MainFileTokens.end() && ...).

Now, while I can just tweak either ReplayPreamble::attach() or remove the PP callback in the test internally I am wondering whether you consider empty preamble & PP with callbacks valid use-case of ReplayPreamble and worth a fix.

This is where we are creating the empty MainFileTokens:

* frame #0: 0x0000000103337649 ClangdTests` clang::clangd::(anonymous namespace)::ReplayPreamble::ReplayPreamble(this=0x0000000114f45da0, Includes=0x00007ffeefbfc678, Delegate=0x0000000114f3bd80, SM=0x0000000114f3dd80, PP=0x0000000115850218, LangOpts=0x0000000114f38eb0, PB=0x00007ffeefbfc658)  + 169 at ParsedAST.cpp:142
  frame #1: 0x000000010333756d ClangdTests` clang::clangd::(anonymous namespace)::ReplayPreamble::ReplayPreamble(this=0x0000000114f45da0, Includes=0x00007ffeefbfc678, Delegate=0x0000000114f3bd80, SM=0x0000000114f3dd80, PP=0x0000000115850218, LangOpts=0x0000000114f38eb0, PB=0x00007ffeefbfc658)  + 77 at ParsedAST.cpp:139
  frame #2: 0x0000000103334fa7 ClangdTests` clang::clangd::(anonymous namespace)::ReplayPreamble::attach(Includes=0x00007ffeefbfc678, Clang=0x0000000114f33ea0, PB=0x00007ffeefbfc658)  + 183 at ParsedAST.cpp:126
  frame #3: 0x0000000103334189 ClangdTests` clang::clangd::ParsedAST::build(Filename=(Data = "/clangd-test/foo.cpp", Length = 20), Inputs=0x00007ffeefbfdf98, CI=nullptr, CompilerInvocationDiags=ArrayRef<clang::clangd::Diag> @ 0x00007ffeefbfd830, Preamble=std::__1::shared_ptr<const clang::clangd::PreambleData>::element_type @ 0x0000000114f3e728 strong=2 weak=1)  + 3897 at ParsedAST.cpp:385
  frame #4: 0x00000001006f1032 ClangdTests` clang::clangd::(anonymous namespace)::ParsedASTTest_PatchesAdditionalIncludes_Test::TestBody(this=0x0000000114f04090)  + 1778 at ParsedASTTests.cpp:477

This is the failed assert:

* frame #4: 0x0000000103337a0c ClangdTests` clang::clangd::(anonymous namespace)::ReplayPreamble::replay(this=0x0000000114f45da0)  + 556 at ParsedAST.cpp:182
  frame #5: 0x000000010333777c ClangdTests` clang::clangd::(anonymous namespace)::ReplayPreamble::FileChanged(this=0x0000000114f45da0, Loc=(ID = 74), Reason=ExitFile, Kind=C_User, PrevFID=(ID = 2))  + 156 at ParsedAST.cpp:166
  frame #6: 0x0000000101b7900a ClangdTests` clang::PPChainedCallbacks::FileChanged(this=0x0000000116204080, Loc=(ID = 74), Reason=ExitFile, FileType=C_User, PrevFID=(ID = 2))  + 90 at PPCallbacks.h:390
  frame #7: 0x0000000101b79046 ClangdTests` clang::PPChainedCallbacks::FileChanged(this=0x00000001162040c0, Loc=(ID = 74), Reason=ExitFile, FileType=C_User, PrevFID=(ID = 2))  + 150 at PPCallbacks.h:391
  frame #8: 0x0000000101b79046 ClangdTests` clang::PPChainedCallbacks::FileChanged(this=0x0000000116204100, Loc=(ID = 74), Reason=ExitFile, FileType=C_User, PrevFID=(ID = 2))  + 150 at PPCallbacks.h:391
  frame #9: 0x0000000101b79046 ClangdTests` clang::PPChainedCallbacks::FileChanged(this=0x0000000116204160, Loc=(ID = 74), Reason=ExitFile, FileType=C_User, PrevFID=(ID = 2))  + 150 at PPCallbacks.h:391
  frame #10: 0x0000000101b79046 ClangdTests` clang::PPChainedCallbacks::FileChanged(this=0x0000000116205480, Loc=(ID = 74), Reason=ExitFile, FileType=C_User, PrevFID=(ID = 2))  + 150 at PPCallbacks.h:391
  frame #11: 0x0000000101b9a590 ClangdTests` clang::Preprocessor::HandleEndOfFile(this=0x0000000115850218, Result=0x0000000116808210, isEndOfMacro=false)  + 3904 at PPLexerChange.cpp:469
  frame #12: 0x0000000101b38713 ClangdTests` clang::Lexer::LexEndOfFile(this=0x0000000116206330, Result=0x0000000116808210, CurPtr="")  + 931 at Lexer.cpp:2833
  frame #13: 0x0000000101b39c44 ClangdTests` clang::Lexer::LexTokenInternal(this=0x0000000116206330, Result=0x0000000116808210, TokAtPhysicalStartOfLine=true)  + 420 at Lexer.cpp:3265
  frame #14: 0x0000000101b382f8 ClangdTests` clang::Lexer::Lex(this=0x0000000116206330, Result=0x0000000116808210)  + 216 at Lexer.cpp:3216
  frame #15: 0x0000000101bd7396 ClangdTests` clang::Preprocessor::Lex(this=0x0000000115850218, Result=0x0000000116808210)  + 118 at Preprocessor.cpp:900
  frame #16: 0x0000000101b75ef1 ClangdTests` clang::Preprocessor::CachingLex(this=0x0000000115850218, Result=0x0000000116808210)  + 289 at PPCaching.cpp:63
  frame #17: 0x0000000101bd73d5 ClangdTests` clang::Preprocessor::Lex(this=0x0000000115850218, Result=0x0000000116808210)  + 181 at Preprocessor.cpp:906
  frame #18: 0x0000000104e3c21c ClangdTests` clang::Parser::TryConsumeToken(this=0x0000000116808200, Expected=semi)  + 204 at Parser.h:489
  frame #19: 0x0000000104e3bf1a ClangdTests` clang::Parser::ExpectAndConsumeSemi(this=0x0000000116808200, DiagID=1526)  + 42 at Parser.cpp:162
  frame #20: 0x0000000104d5f0e0 ClangdTests` clang::Parser::ParseDeclGroup(this=0x0000000116808200, DS=0x00007ffeefbfb6e8, Context=FileContext, DeclEnd=0x0000000000000000, FRI=0x0000000000000000)  + 3168 at ParseDecl.cpp:2243
  frame #21: 0x0000000104e42c48 ClangdTests` clang::Parser::ParseDeclOrFunctionDefInternal(this=0x0000000116808200, attrs=0x00007ffeefbfbc80, DS=0x00007ffeefbfb6e8, AS=AS_none)  + 1704 at Parser.cpp:1109
  frame #22: 0x0000000104e42170 ClangdTests` clang::Parser::ParseDeclarationOrFunctionDefinition(this=0x0000000116808200, attrs=0x00007ffeefbfbc80, DS=0x0000000000000000, AS=AS_none)  + 144 at Parser.cpp:1125
  frame #23: 0x0000000104e411fa ClangdTests` clang::Parser::ParseExternalDeclaration(this=0x0000000116808200, attrs=0x00007ffeefbfbc80, DS=0x0000000000000000)  + 3642 at Parser.cpp:945
  frame #24: 0x0000000104e3f254 ClangdTests` clang::Parser::ParseTopLevelDecl(this=0x0000000116808200, Result=0x00007ffeefbfbde8, IsFirstDecl=false)  + 1828 at Parser.cpp:693
  frame #25: 0x0000000104d465e3 ClangdTests` clang::ParseAST(S=0x000000011680c200, PrintStats=false, SkipFunctionBodies=false)  + 595 at ParseAST.cpp:158
  frame #26: 0x0000000101992662 ClangdTests` clang::ASTFrontendAction::ExecuteAction(this=0x0000000114f349a0)  + 322 at FrontendAction.cpp:1066
  frame #27: 0x0000000101991b78 ClangdTests` clang::FrontendAction::Execute(this=0x0000000114f349a0)  + 136 at FrontendAction.cpp:959
  frame #28: 0x00000001033343ab ClangdTests` clang::clangd::ParsedAST::build(Filename=(Data = "/clangd-test/foo.cpp", Length = 20), Inputs=0x00007ffeefbfdf98, CI=nullptr, CompilerInvocationDiags=ArrayRef<clang::clangd::Diag> @ 0x00007ffeefbfd830, Preamble=std::__1::shared_ptr<const clang::clangd::PreambleData>::element_type @ 0x0000000114f3e728 strong=2 weak=1)  + 4443 at ParsedAST.cpp:415
  frame #29: 0x00000001006f1032 ClangdTests` clang::clangd::(anonymous namespace)::ParsedASTTest_PatchesAdditionalIncludes_Test::TestBody(this=0x0000000114f04090)  + 1778 at ParsedASTTests.cpp:477
uabelho added a subscriber: uabelho.Jun 4 2020, 2:17 AM
uabelho added inline comments.
clang-tools-extra/clangd/CodeComplete.cpp
1034

Hi!

When compiling with gcc 7.4 I see a warning that I think originates from this line:

[4032/4668] Building CXX object tools/clang/tools/extra/clangd/CMakeFiles/obj.clangDaemon.dir/CodeComplete.cpp.o
In file included from /data/repo/master/llvm/include/llvm/ADT/STLExtras.h:19:0,
                 from /data/repo/master/llvm/include/llvm/ADT/StringRef.h:12,
                 from /data/repo/master/clang-tools-extra/clangd/URI.h:12,
                 from /data/repo/master/clang-tools-extra/clangd/Protocol.h:26,
                 from /data/repo/master/clang-tools-extra/clangd/Headers.h:12,
                 from /data/repo/master/clang-tools-extra/clangd/CodeComplete.h:18,
                 from /data/repo/master/clang-tools-extra/clangd/CodeComplete.cpp:20:
/data/repo/master/llvm/include/llvm/ADT/Optional.h: In instantiation of 'void llvm::optional_detail::OptionalStorage<T, <anonymous> >::emplace(Args&& ...) [with Args = {const clang::clangd::PreamblePatch}; T = const clang::clangd::PreamblePatch; bool <anonymous> = false]':
/data/repo/master/llvm/include/llvm/ADT/Optional.h:55:14:   required from 'llvm::optional_detail::OptionalStorage<T, <anonymous> >::OptionalStorage(llvm::optional_detail::OptionalStorage<T, <anonymous> >&&) [with T = const clang::clangd::PreamblePatch; bool <anonymous> = false]'
/data/repo/master/llvm/include/llvm/ADT/Optional.h:228:3:   required from here
/data/repo/master/llvm/include/llvm/ADT/Optional.h:89:12: warning: cast from type 'const clang::clangd::PreamblePatch*' to type 'void*' casts away qualifiers [-Wcast-qual]
     ::new ((void *)std::addressof(value)) T(std::forward<Args>(args)...);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~