This enables clangd to intercept compiler diagnostics and attach fixes (e.g. by
querying index). This patch adds missing includes for incomplete types e.g.
member access into class with only forward declaration. This would allow adding
missing includes for user-typed symbol names that are missing declarations
(e.g. typos) in the future.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
This looks pretty good! My main concern is latency (and variability of latency) in practice.
Particularly:
- we should avoid paying for fuzzyfind and fetching hundreds of results when we want exact-match
- limit the damage in degenerate cases: should we limit to e.g. 5 index queries per TU?
Actually, now that I think about it - if we can see a forward declaration, can't we do a lookup() instead of a fuzzyFind?
Is the problem that this doesn't generalize to the no-declaration case (where it looks like a typo)?
If we had an operation that looked like lookup() but worked by qname, would we design the feature the same way?
In particular, would we want to batch the requests to the index (lookup takes a set of IDs)? This would affect the code structure a bit. But it would make the feature cost basically O(1) instead of O(errs)...
clangd/ClangdUnit.cpp | ||
---|---|---|
295 ↗ | (On Diff #182796) | Probably worth a two-line comment explaining at a high level what this is doing. |
clangd/ClangdUnit.h | ||
65 ↗ | (On Diff #182796) | (nit: can we keep this as a plain struct? Once the number of members get large, initializing them one-by-one probably reads better anyway...) |
91 ↗ | (On Diff #182796) | Index is a reasonable (if surprising) param here, but I think we need to explicitly enable the include-fixing behavior. Even the very minimal hard-coded list of clang-tidy checks caused issues :-( And this is going to result in new network RPCs in some configs. I'd suggest keeping the Index param, and wrapping the "use include fixer?" policy along with the clang-tidy options in D55256 as some sort of "diagnostic options" struct. (Though be wary of name clash with the one in Diagnostics.h, sigh). |
clangd/Diagnostics.h | ||
105 ↗ | (On Diff #182796) | This seems like a suspicious dependency. |
clangd/IncludeFixer.cpp | ||
1 ↗ | (On Diff #182796) | new copyright notice |
39 ↗ | (On Diff #182796) | can you add a trace to this function so we know what the impact on latency is? |
66 ↗ | (On Diff #182796) | limit? |
74 ↗ | (On Diff #182796) | so issuing a bunch of fuzzy finds in sequence is clearly not ideal from a performance standpoint. |
112 ↗ | (On Diff #182796) | should be no need for toString here |
clangd/IncludeFixer.h | ||
1 ↗ | (On Diff #182796) | new copyright notice |
39 ↗ | (On Diff #182796) | nit: incomplete is one word |
unittests/clangd/IncludeFixerTests.cpp | ||
1 ↗ | (On Diff #182796) | new copyright notice; wrong filename |
26 ↗ | (On Diff #182796) | because the helpers and public APIs tested are common, I wonder if we'd be better off creating DiagnosticsTests.cpp and moving the relevant tests from both ClangdUnit and here to there. (Testing the fixes directly through the IncludeFixer class would make sense in this file, but it's awkward - I think testing through the diagnostics API is actually the right thing) |
A drop-by comment.
clangd/IncludeFixer.cpp | ||
---|---|---|
66 ↗ | (On Diff #182796) | Why do we use fuzzyFind and not lookup here? |
I switched to use lookup requests for incomplete type diagnostics. And yes, the typo errors need to use FuzzyFind or lookup by names as no USR is available.
If we had an operation that looked like lookup() but worked by qname, would we design the feature the same way?
In particular, would we want to batch the requests to the index (lookup takes a set of IDs)? This would affect the code structure a bit. But it would make the feature cost basically O(1) instead of O(errs)...
As chatted offline, we decided to leave the batching optimization as a TODO in this patch. See inline comment for more detailed response regarding performance.
clangd/ClangdUnit.h | ||
---|---|---|
91 ↗ | (On Diff #182796) | How about ParseOptions? |
clangd/IncludeFixer.cpp | ||
39 ↗ | (On Diff #182796) | Added a tracer in fixInCompleteType to avoid noise from unsupported diagnostics. |
66 ↗ | (On Diff #182796) | Switched to use lookup. The typo diagnostics (i.e. undefined symbol) can't use lookup as there is no declaration, but we can definitely use lookup for incomplete types in this patch. |
74 ↗ | (On Diff #182796) | As you suggested, batching requests from all diagnostics would definitely help. SymbolIndex already supports batch lookup by IDs, but we would need to extend SymbolIndex to support lookup by Names for the typo errors that we are handling in D57021. In order to batch index requests from all diagnostics, the existing logic around handling and storing diagnostics might also need refactoring. I added a TODO for this. We can revisit if the performance turned out to be a big problem. Another thing we could do is adding cache for index results. For example, for the code copy-paste case, one incomplete/undefined symbol can cause multiple errors. Caching should save us some unnecessary index queries. For this patch, I added a limit on the number of index requests in IncludeFixer (5 for now), which would prevent us from sending too many index requests when building AST. |
Just a bunch of nits, up to you to work out how/whether to address.
One substantive issue around classes defined outside headers - will chat offline.
clangd/ClangdServer.h | ||
---|---|---|
118 ↗ | (On Diff #183509) | Can we call this option SuggestMissingIncludes or so? include-fixer is the name of a separate tool. |
clangd/ClangdUnit.cpp | ||
305 ↗ | (On Diff #183509) | nit: I'm not sure I agree with the IncludeFixer class taking ownership of the include inserter. |
295 ↗ | (On Diff #182796) | nit: can you hoist this above the line declaring FixIncludes? |
clangd/Compiler.h | ||
49 ↗ | (On Diff #183509) | isn't an optional pointer just a pointer? |
clangd/IncludeFixer.cpp | ||
39 ↗ | (On Diff #183509) | consider moving this 5 to a constructor param, so this file can be entirely mechanism and the policy is spelled out at the construction site. |
41 ↗ | (On Diff #183509) | (You could also just write this as just a switch on Info.getID(), I'd find that a little direct/easier to read, up to you) |
61 ↗ | (On Diff #183509) | TypeName? |
63 ↗ | (On Diff #183509) | this seems an odd thing to check and way to phrase - there *was* a type in the message, but its name happens to be empty. Am I missing something? (I'd suggest just removing this) |
80 ↗ | (On Diff #183509) | fixme is obsolete |
81 ↗ | (On Diff #183509) | no need to check name |
86 ↗ | (On Diff #183509) | There's a case I'm worried about: Foo.h declares class X;, Foo.cpp defines class X{}.
I'm not sure the index really has enough structured information to detect this case, but we could check e.g. that definition header == canonical declaration header? Or that basename(IncludeHeaders) == basename(definition header)? If possible, can you add a test for this case? |
clangd/IncludeFixer.h | ||
39 ↗ | (On Diff #182796) | not done - method is still "fixInCompleteType" |
clangd/tool/ClangdMain.cpp | ||
210 ↗ | (On Diff #183509) | similarly, I'd call this -suggest-missing-includes or just -suggest-includes |
214 ↗ | (On Diff #183509) | why hidden? |