When asked for references during cross-file rename, index might return implicit references to the renamed symbol (such as those in macro expansions). To fix the incorrect behavior, this patch introduces basic filtering machinery which ensures that all ranges where renaming is about to be applied actually contains the identifier the user asked to rename.
Details
Diff Detail
Event Timeline
(apologies, the FIXME may imply this approach...)
this approach is based on an assumption: the index results are matched to the latest file content, but this is not always true in practice, our index maybe stale (index results came from an old snapshot of the file), then this approach will fail.
I think we should do it in another direction:
- add a new RefKind (something like implicit references, or named references) to clangd::Ref
- when querying the index for rename, we set a corresponding Filter in the query request (or filter out non-interesting references based on the RefKind afterwards)
I think this approach would also fail for stale index, wouldn't it?
I can totally understand why that might be slightly better for performance, but if we have no guarantees that our index is aware of implicit references and would not be able to mark those in the first place, this implementation would shield us from that.
Anyway, I am happy to learn more about why the proposed approach might be better, but I do not fully understand the concern here.
(also, this should eliminate unwanted changes caused by the index being stale, wouldn't it?)
If we go with the solution proposed by @hokein, it looks like using the current patch is an improvement of what we have now.
One big issue with the adding a new ref kind/ref modifier is that it requires modifications to Kythe-based index implementation, something that cannot be done as easily as landing this patch.
WDYT about landing something similar to this patch for now and discussing the possibilities of fixing it by storing enough information in the references later?
clang-tools-extra/clangd/SourceCode.cpp | ||
---|---|---|
217 ↗ | (On Diff #234265) | this one isn't used anywhere? |
1136 ↗ | (On Diff #234265) | SM doesn't seem to be necessary, as lex already provides that in the callback. |
clang-tools-extra/clangd/SourceCode.h | ||
301 ↗ | (On Diff #234265) | i don't think it is necessary for this function to be made public, it should be OK for it to leave in rename.cpp as a helper. |
clang-tools-extra/clangd/refactor/Rename.cpp | ||
367 | this one should go after adjustRenameRanges | |
367 | both this and adjustRenameRanges seems to be lexing the source code to get identifier locations. can we lex the file only a single time instead and make use of the result in both of the functions? as for implementation of filterRenameRanges you might wanna return intersection of RenameRanges and result of collectIdentifierRanges |
Sorry for a delay: I was trying to work with range patching heuristics and get it to work in generic case of "stale index returns more results than lexer", but in the end I converged to the simplest possible version of the intended change.
clang-tools-extra/clangd/refactor/Rename.cpp | ||
---|---|---|
431 | duplication | |
590–599 | i believe lexing might yield a superset even if index is up-to-date, void foo() { int bar; } void baz() { int ba^r; } | |
594 | exactly for the same reason above, it might not be a subset even in those circumstances. | |
596 | i believe we should do that even if it is not a subset. | |
627 | + | |
clang-tools-extra/clangd/refactor/Rename.h | ||
80 | assertion below says |
Addressed a bunch of comments to cleanup the patch and replied to ask for clarification of several unresolved comments.
clang-tools-extra/clangd/refactor/Rename.cpp | ||
---|---|---|
590–599 | Good point. Any local variable (IIRC we don't store local variables in index, so it's not there) having the same identifier as the renamed symbol might cause the indexed ranges to be a subset of lexed ranges. However,
I tried to describe the case when lexer will return a _subset_ of indexed results, which happens in practice. Is the suggestion to change the comments you think are misleading? | |
596 | Maybe my understanding of getMappedRanges is incorrect, but I suppose this will happen in case indexed ranges are subset of lexed ranges in some form (i.e. no need to find exact matches explicitly). Am I missing something? |
assertion below says
assert(Indexed.size() <= Lexed.size());