The heuristic is simple, only for cases we are confident.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Build result: FAILURE - Could not check out parent git hash "6171238cb7e9e8616d21ce09d463e945ce0a9fb8". It was not found in the repository. Did you configure the "Parent Revision" in Phabricator properly? Trying to apply the patch to the master branch instead...
ERROR: arc patch failed with error code 1. Check build log for details.
Log files: console-log.txt, CMakeCache.txt
High level comments based on offline discussion:
I think we want to define/formalize the concept of a near miss, to make precise the tradeoffs between false positives, false negatives, and implementability.
Not just at the individual level (an index occurrence vs a lexed occurrence) but at a whole-document level.
A possible definition, intuitively finding the locations where the indexed occurrences are now spelled:
- a near miss maps all of the name occurrences in the index onto a subset of the lexed occurrences. (names may refer to more than one thing)
- indexed occurrences must all be mapped. Result must be distinct, and preserve order. (Support only simple edits to ensure our mapping is robust)
- each indexed->lexed correspondence may change row or column but not both (increases chance our mapping is robust)
Then we can use a greedy algorithm to find a match (or memoized DFS to enumerate/compare all matches)
A good metric for comparing competing mappings might be the sum of the implied edit sizes between successive entries.
i.e. if the first three mappings are offset by 1 column down, and the last is offset by 2 columns down, then this implies an insertion of 1 line at the top of the file and 1 line between edits 3 and 4, for a total of 2.
subset is fine/good to handle as a special case - if the subset is small we don't want to find it by search.
clang-tools-extra/clangd/refactor/Rename.cpp | ||
---|---|---|
388 | I think this is just | |
567 | That doesn't mean you need std::move to get a move, it just means you can't avoid calling the move constructor. |
clang-tools-extra/clangd/refactor/Rename.h | ||
---|---|---|
96 | Oops, forgot this... And the class structure wrapping a LangOpts ref seems like a detail that can be hidden. I'd like to see:
then you can unit test the first and second and smoke test the third. Tests like Indexed = "int [[x]] = 0; void foo(int x);"; Draft = "double [[x]] = 0; void foo(double x);"; verifyRenameMatches(Indexed, Draft); |
address comments
- re-define the concept of a near miss
- add metric for evaluate how good a near miss is
clang-tools-extra/clangd/refactor/Rename.h | ||
---|---|---|
96 |
There is an existing function collectIdentifierRanges in SourceCode.cpp, and it has been unittested.
sure, I think it is sufficient to test the second one, since the second one is a simple wrapper of the getMappedRanges. |
clang-tools-extra/clangd/refactor/Rename.h | ||
---|---|---|
96 |
Did you mean "sufficient to test the first one"? |
Build result: pass - 60453 tests passed, 0 failed and 726 were skipped.
Log files: console-log.txt, CMakeCache.txt
clang-tools-extra/clangd/refactor/Rename.cpp | ||
---|---|---|
384 | the name here isn't much simpler/clearer than the code. | |
442 | it looks like itercount at the moment is counting depth and ignoring breadth. I think you want to pass by reference and increment on each call, instead. | |
442 | FWIW, this seems like a class that could be a function: vector<int> &PartialMatch ArrayRef<Range> IndexedRest // just those not covered by PartialMatch ArrayRef<Range> MatchedRest // those still to consider int &Fuel // set to 10000, decrement, bail out once it goes negative Callback // as now, though no need for the parameter | |
442 | you can bail out as soon as there are not enough lexed tokens remaining to match | |
452 | if we're actually evaluating all ranges, can we pass the index array (by reference), use it to evaluate scores, and only copy ranges for the winner? | |
456 | This works but maybe more common/obvious is to use recursion for both cases instead of the loop: if (isLineOrColumnEqual(..., Lexed[NextLexed])){ // match ... enumerate(NextMatched+1, NextLexed+1); } // don't match enumerate(NextMatched, NextLexed+1); | |
457 | Visited seems redundant: it always contains [0, NextLexed) | |
574 | Not clear to me why we're building these structures. so what about something like: Cost = 0; LastLine = -1; LastDX = 0, LastDY = 0; for (I) { DX = Mapped[I].begin.character - Indexed[I].begin.character; DY = Mapped[I].begin.line - Indexed[I].begin.line; Line = Mapped[I].begin.line; if (!(Line == LastLine && DX == LastDX)) LastDX = 0; // horizontal offsets don't carry across lines Cost += abs(DX - LastDX) + abs(DY - LastDY); LastDX, LastDY, LastLine = DX, DY, Line; } | |
clang-tools-extra/clangd/refactor/Rename.h | ||
59 | I'm not sure if authoritative is accurate here - authoritative would be from an AST or fresh index. I'd suggest "Adjusts indexed occurrences to match the current state of the file" | |
61 | nit: The index is not always up to date. | |
62 | "bad rename result" is vague. | |
62 | "This API helps" doesn't add much here - maybe merge with the next sentence? | |
66 | The details comment belongs in the implementation rather than the header, I think. | |
75 | "we use the best near miss" - it would help to describe approximately what this means. It's very hard to understand that this can fail because no candidate is "near". e.g. "we attempt to map the indexed locations onto candidates in a plausible way (e.g. guess that lines were inserted). If such a "near miss" is found, the rename is still possible" | |
77 | nit: occcurrences -> occurences | |
79 | I think this use of "patch" is a little confusing, and the name is a bit vague for the clangd namespace. | |
84 | "the mapping result" doesn't make sense in this context (the header), as you haven't exposed/described the function that computes mappings or otherwise defined the concept. (I think exposing the function would be best both for clarity and for unit-tests) | |
86 | I'd suggest moving most of this comment (at least the examples) to the cpp file. | |
103 | editCost is again not a specific enough name for this operation. The cost we're describing is specific to *adjusting* the indexed ranges to correspond to the mapped ones. Maybe renameRangeAdjustmentCost()? |
Build result: pass - 60605 tests passed, 0 failed and 726 were skipped.
Log files: console-log.txt, CMakeCache.txt
some minor fixes.
clang-tools-extra/clangd/refactor/Rename.cpp | ||
---|---|---|
452 | we could use the index array to evaluate the scores, but it would make the cost API signature a bit weird, like size_t renameRangeAdjustmentCost(ArrayRef<Range> Indexed, ArrayRef<Range> Lexed, ArrayRef<size_t> MatchedLexedIndex); | |
clang-tools-extra/clangd/refactor/Rename.h | ||
96 | I got the point here, exposed getMappedRanges and added unit tests for it. |
Build result: pass - 60605 tests passed, 0 failed and 726 were skipped.
Log files: console-log.txt, CMakeCache.txt
clang-tools-extra/clangd/refactor/Rename.cpp | ||
---|---|---|
359–371 | nit: these are not candidates, they are RenameRanges or similar | |
364 | This is worth a comment (because the error message returned describes a condition that we usually recover from) | |
452 | It's not really an API right, just a helper function exposed for testing? I don't think this is a a problem. | |
569 | why returning Expected rather than Optional here - what do we want to do with the message? | |
589 | This message isn't meaningful outside this TU - it should be a vlog, or easier to understand | |
593 | return Indexed.vec() | |
618 | distinct -> unique | |
622 | found -> find | |
622 | (again, these error messages are not useful to the user, as-is) | |
clang-tools-extra/clangd/refactor/Rename.h | ||
82 | also exposed for testing only? | |
clang-tools-extra/clangd/unittests/RenameTests.cpp | ||
866 | can you make the new line non-empty (add a comment) and change int->double instead of adding whitespace? Was hard to see what's going on here |
address comments:
- don't emit the internal messages to users, llvm::Expected => llvm::Optional
- use the index of lexed array to calculate the adjustment cost.
clang-tools-extra/clangd/refactor/Rename.cpp | ||
---|---|---|
569 | I found these error messages are useful for debugging. But you are right, end users are not interested in them, changed to vlog and Optional here. |
Build result: pass - 60627 tests passed, 0 failed and 726 were skipped.
Log files: console-log.txt, CMakeCache.txt
I'm not sure if authoritative is accurate here - authoritative would be from an AST or fresh index.
I'd suggest "Adjusts indexed occurrences to match the current state of the file"