expandedTokens(SourceRange) used to do a binary search to get the
expanded tokens belonging to a source range. Each binary search uses
isBeforeInTranslationUnit to order two source locations. This is
inherently very slow.
By profiling clangd we found out that users like clangd::SelectionTree
spend 95% of time in isBeforeInTranslationUnit. Also it is worth
noting that users of expandedTokens(SourceRange) majorly use ranges
provided by AST to query this funciton. The ranges provided by AST are
token ranges (starting at the beginning of a token and ending at the
beginning of another token).
Therefore we can avoid the binary search in majority of the cases by
maintaining an index of ExpandedToken by their SourceLocations. We still
do binary search for ranges which are not token ranges but such
instances are quite low.
For a cachedAST case, we have wins even for normal clangd usage. Invoking location based features has same performance both in batch mode and normal server mode.
Performance:
~/build/bin/clangd --check=clang/lib/Serialization/ASTReader.cpp
Before: Took 2:10s to complete.
Now: Took 1:13s to complete.
+45% faster.
For medium size buffers ( like XRefs.cpp) the iteration is negligible as compared to ParsedAST. The wins are lower 4-8% for smaller buffers.
~/build/bin/clangd -- clang-tools-extra/clangd/XRefs.cpp
Preamble: 5s
ParsedAST: 0.54s
Location Features: 0.51s (5% speedup)
ExpandedToken iteration: 0.00018s (0.03% of ParsedAST)
Huge win for Macro heavy files (like tests)
~/build/bin/clangd --check=clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
Preamble: 2.3s
ParsedAST: 0.23s
ExpandedToken iteration: 0.002s (0.85% of ParsedAST)
Location feature: 1702s to 363s (+80% faster)
Hmm, the primary doc (first sentence) should probably be about the effect rather than implementation details.
Maybe just "Builds a cache to make expandedToken(SourceRange) faster."?