This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Make use of syntax tokens in ReplayPreamble
ClosedPublic

Authored by kadircet on Feb 19 2020, 8:41 AM.

Details

Summary

Replace usage of RawLexer with syntax tokens inside ReplayPreamble.

Diff Detail

Event Timeline

kadircet created this revision.Feb 19 2020, 8:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2020, 8:41 AM
sammccall added inline comments.Feb 19 2020, 9:10 AM
clang-tools-extra/clangd/ParsedAST.cpp
137

tokenizing the whole file an extra time on every AST build seems a bit sad - this is considerably more lexing than we were doing before. Probably doesn't matter?

We could trim this to the preamble bounds I guess. Or even compute it once when the preamble is built, since we assume all the bytes are the same? I guess SourceLocations are the problem... we could just translate offsets into the new SM, but that gets messy.
On the other hand, assuming the preamble isn't going to change at all seems like an assumption not long for this world.
On the first hand again, maybe we'll have to revisit looots of stuff (go to definition and everything) once that assumption breaks anyway.

176

why raw encoding?

178

this looks like a linear search for each #include

189

Not clear what "imitate the PP logic" means.
We construct a fake 'import'/'include' token... nobody cares about clang::Token::Flags.

kadircet marked 7 inline comments as done.Feb 21 2020, 12:31 PM
kadircet added inline comments.
clang-tools-extra/clangd/ParsedAST.cpp
137

Implemented a way to partially tokenize a file in D74962.

On the other hand, assuming the preamble isn't going to change at all seems like an assumption not long for this world.

It should be okay for replaypreambles as only clang tidy checkers depends on this logic and we are not planning to emit diagnostics with stale preambles.

178

made it logarithmic instead, we can also make it linear in total if we decide to rely on the fact that MainFileIncludes are sorted. I believe it is currently true but never promised by the include collector.

189

it was refering to the fact that we were performing the PP.LookupIdentifierInfo call to set kind etc.

kadircet updated this revision to Diff 245960.Feb 21 2020, 12:31 PM
kadircet marked 3 inline comments as done.
  • Address comments.
  • Add tests by mimicking a clang-tidy check.
  • only tokenize the preamble section, not the whole file.
kadircet updated this revision to Diff 245962.Feb 21 2020, 12:34 PM
  • Update comment
sammccall accepted this revision.Mar 4 2020, 1:31 AM
sammccall added inline comments.
clang-tools-extra/clangd/ParsedAST.cpp
182

nit: IncludeTok

clang-tools-extra/clangd/unittests/ParsedASTTests.cpp
388

I think it would be clearer to have parallel named point/range lists rather than doing index math.

So the annotated code would be pretty verbose like: $hash^#$include[[import]] $filerange^"$file[[bar.h]]"...
But I think the setup/asserts would be clearer.

This revision is now accepted and ready to land.Mar 4 2020, 1:31 AM
kadircet updated this revision to Diff 248124.Mar 4 2020, 2:00 AM
kadircet marked 2 inline comments as done.
  • More annotations to the test
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Mar 4 2020, 2:33 AM

Breaks win: http://45.33.8.238/win/9705/step_9.txt

(Or your other change that landed at the same time)