This is an archive of the discontinued LLVM Phabricator instance.

Build result: FAILURE - Could not check out parent git hash "6171238cb7e9e8616d21ce09d463e945ce0a9fb8". It was not found in the repository. Did you configure the "Parent Revision" in Phabricator properly? Trying to apply the patch to the master branch instead...

ERROR: arc patch failed with error code 1. Check build log for details.
Log files: console-log.txt, CMakeCache.txt

Harbormaster failed remote builds in B41368: Diff 230628. Nov 22 2019, 3:59 AM

rebase

High level comments based on offline discussion:

I think we want to define/formalize the concept of a near miss, to make precise the tradeoffs between false positives, false negatives, and implementability.
Not just at the individual level (an index occurrence vs a lexed occurrence) but at a whole-document level.
A possible definition, intuitively finding the locations where the indexed occurrences are now spelled:

a near miss maps all of the name occurrences in the index onto a subset of the lexed occurrences. (names may refer to more than one thing)
indexed occurrences must all be mapped. Result must be distinct, and preserve order. (Support only simple edits to ensure our mapping is robust)
each indexed->lexed correspondence may change row or column but not both (increases chance our mapping is robust)

Then we can use a greedy algorithm to find a match (or memoized DFS to enumerate/compare all matches)
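
As a rough illustration of the greedy option, the sketch below walks the indexed occurrences in order and maps each one to the earliest remaining lexed occurrence that shares its line or its column. `Pos`, `impliesSimpleEdit` and `greedyMatch` are hypothetical names for this example only, not clangd APIs. The sketch finds some valid mapping if one exists, not necessarily the cheapest one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an LSP position (zero-based line and column).
struct Pos {
  int Line;
  int Column;
};

// A correspondence may change the line or the column, but not both.
bool impliesSimpleEdit(const Pos &A, const Pos &B) {
  return A.Line == B.Line || A.Column == B.Column;
}

// Maps every indexed occurrence onto a distinct lexed occurrence, preserving
// order. On success, Out holds one index into Lexed per indexed occurrence.
// REQUIRED: Indexed and Lexed are sorted by position.
bool greedyMatch(const std::vector<Pos> &Indexed,
                 const std::vector<Pos> &Lexed,
                 std::vector<std::size_t> &Out) {
  Out.clear();
  std::size_t Next = 0; // first lexed occurrence not yet consumed
  for (const Pos &I : Indexed) {
    // Skip lexed occurrences that would need both a line and a column change.
    while (Next < Lexed.size() && !impliesSimpleEdit(I, Lexed[Next]))
      ++Next;
    if (Next == Lexed.size()) {
      Out.clear();
      return false; // ran out of candidates: no mapping exists
    }
    Out.push_back(Next++);
  }
  return true;
}
```

Taking the earliest compatible candidate never rules out a later match, so this answers "does a mapping exist" cheaply. Comparing competing mappings still needs the enumeration approach.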

A good metric for comparing competing mappings might be the sum of the implied edit sizes between successive entries.
i.e. if the first three mappings are offset by 1 line down, and the last is offset by 2 lines down, then this implies an insertion of 1 line at the top of the file and 1 line between edits 3 and 4, for a total of 2.

subset is fine/good to handle as a special case - if the subset is small we don't want to find it by search.
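
To make the proposed metric concrete, here is a minimal sketch that sums the implied edit sizes between successive indexed -> mapped pairs. `Pos` and `adjustmentCost` are hypothetical names for this example. Resetting the column displacement whenever the line changes is an assumption of the sketch, not something the discussion above fixes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for an LSP position (zero-based line and column).
struct Pos {
  int Line;
  int Column;
};

// Sums the edits implied between successive indexed -> mapped pairs.
// A change in line displacement counts as inserted or removed lines; a change
// in column displacement counts as inserted or removed characters.
// REQUIRED: Indexed and Mapped are sorted and have the same size.
std::size_t adjustmentCost(const std::vector<Pos> &Indexed,
                           const std::vector<Pos> &Mapped) {
  assert(Indexed.size() == Mapped.size());
  int LastLine = -1, LastDX = 0, LastDY = 0;
  std::size_t Cost = 0;
  for (std::size_t I = 0; I < Indexed.size(); ++I) {
    int DX = Mapped[I].Column - Indexed[I].Column;
    int DY = Mapped[I].Line - Indexed[I].Line;
    int Line = Mapped[I].Line;
    if (Line != LastLine)
      LastDX = 0; // assumption: horizontal offsets don't carry across lines
    Cost += static_cast<std::size_t>(std::abs(DX - LastDX) +
                                     std::abs(DY - LastDY));
    LastLine = Line;
    LastDX = DX;
    LastDY = DY;
  }
  return Cost;
}
```

With four occurrences on lines 0, 1, 2, 3 mapped to lines 1, 2, 3, 5, the line displacements are 1, 1, 1, 2. The cost is 1 for the first shift plus 1 for the extra shift before the last entry, matching the total of 2 in the example above.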

clang-tools-extra/clangd/refactor/Rename.cpp
388	I think this is just `return std::includes(Superset.begin(), Superset.end(), Subset.begin(), Subset.end());` (probably clear enough to just inline at the call site)
567	That doesn't mean you need std::move to get a move, it just means you can't avoid calling the move constructor. https://godbolt.org/z/Rh-CvT
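
For reference, a small self-contained sketch of the `std::includes` suggestion. It uses `int` elements in place of clangd's `Range`, and `isSubset` is a hypothetical helper name. The algorithm requires both ranges to be sorted with the same ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if every element of Subset also appears in Superset.
// REQUIRED: both vectors are sorted in ascending order.
bool isSubset(const std::vector<int> &Superset,
              const std::vector<int> &Subset) {
  assert(std::is_sorted(Superset.begin(), Superset.end()));
  assert(std::is_sorted(Subset.begin(), Subset.end()));
  return std::includes(Superset.begin(), Superset.end(),
                       Subset.begin(), Subset.end());
}
```

An empty `Subset` is trivially included, so callers that want a non-empty match need to check for that separately.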

sammccall added inline comments. Dec 2 2019, 7:14 AM

clang-tools-extra/clangd/refactor/Rename.h
96	Oops, forgot this... I think the public API isn't quite right here: exposing parts for testing is fine, but the classification itself isn't fine-grained enough, I think. (It's too easy to write a test that "passes" when the actual mapping found isn't the right one.) And the class structure wrapping a LangOpts ref seems like a detail that can be hidden.
I'd like to see:
a function that returns the lexed ranges from a StringRef/LangOpts
a function that constructs the mapping given two sequences of ranges (like `getMappedRanges(ArrayRef<Range>, ArrayRef<Range>) -> vector<Range>`)
a function that ties these together to the data structures we care about (i.e. taking Code + identifier + LangOpts + ArrayRef<Ref> or so)
Then you can unit test the first and second and smoke test the third. Tests like:
Indexed = "int [[x]] = 0; void foo(int x);";
Draft = "double [[x]] = 0; void foo(double x);";
verifyRenameMatches(Indexed, Draft);

address comments

re-define the concept of a near miss
add a metric to evaluate how good a near miss is

hokein added inline comments. Dec 4 2019, 5:53 AM

clang-tools-extra/clangd/refactor/Rename.h
96	> a function that returns the lexed ranges from a StringRef/LangOpts
There is an existing function `collectIdentifierRanges` in SourceCode.cpp, and it has been unit-tested.
> a function that constructs the mapping given two sequences of ranges (like `getMappedRanges(ArrayRef<Range>, ArrayRef<Range>) -> vector<Range>`)
> a function that ties these together to the data structures we care about (i.e. taking Code + identifier + LangOpts + ArrayRef<Ref> or so)
Sure, I think it is sufficient to test the second one, since the second one is a simple wrapper of the `getMappedRanges`.

sammccall added inline comments. Dec 4 2019, 6:02 AM

clang-tools-extra/clangd/refactor/Rename.h
96	> sure, I think it is sufficient to test the second one, since the second one is a simple wrapper of the `getMappedRanges`.
Did you mean "sufficient to test the first one"? Testing the second one is certainly sufficient, but tests more than it needs to (particularly the lexing bits again).

Build result: pass - 60453 tests passed, 0 failed and 726 were skipped.

Log files: console-log.txt, CMakeCache.txt

Harbormaster completed remote builds in B41848: Diff 232101. Dec 4 2019, 6:25 AM

sammccall added inline comments. Dec 5 2019, 10:45 AM

clang-tools-extra/clangd/refactor/Rename.cpp
384	the name here isn't much simpler/clearer than the code. `impliesSimpleEdit`?
442	it looks like itercount at the moment is counting depth and ignoring breadth. I think you want to pass by reference and increment on each call, instead.
442	FWIW, this seems like a class that could be a function. Params are:
vector<int> &PartialMatch
ArrayRef<Range> IndexedRest // just those not covered by PartialMatch
ArrayRef<Range> MatchedRest // those still to consider
int &Fuel // set to 10000, decrement, bail out once it goes negative
Callback // as now, though no need for the parameter
442	you can bail out as soon as there are not enough lexed tokens remaining to match
452	if we're actually evaluating all ranges, can we pass the index array (by reference), use it to evaluate scores, and only copy ranges for the winner?
456	This works but maybe more common/obvious is to use recursion for both cases instead of the loop:
if (isLineOrColumnEqual(..., Lexed[NextLexed])) { // match
  ...
  enumerate(NextMatched+1, NextLexed+1);
}
// don't match
enumerate(NextMatched, NextLexed+1);
457	Visited seems redundant: it always contains [0, NextLexed)
574	Not clear to me why we're building these structures. The cost is a sum of implied edits, and implied edits are a function of (last displacement, current displacement, are they on the same line), so what about something like:
Cost = 0;
LastLine = -1;
LastDX = 0, LastDY = 0;
for (I) {
  DX = Mapped[I].begin.character - Indexed[I].begin.character;
  DY = Mapped[I].begin.line - Indexed[I].begin.line;
  Line = Mapped[I].begin.line;
  if (!(Line == LastLine && DX == LastDX))
    LastDX = 0; // horizontal offsets don't carry across lines
  Cost += abs(DX - LastDX) + abs(DY - LastDY);
  LastDX, LastDY, LastLine = DX, DY, Line;
}
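
The recursion and fuel suggestions in the comments on lines 442 to 457 can be sketched in a self-contained form as below. This is an illustration under simplifying assumptions, not the patch's code: `Pos`, `enumerateNearMisses` and `allNearMisses` are hypothetical names, positions stand in for ranges, and `std::function` replaces `llvm::function_ref`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an LSP position (zero-based line and column).
struct Pos {
  int Line;
  int Column;
};

bool impliesSimpleEdit(const Pos &A, const Pos &B) {
  return A.Line == B.Line || A.Column == B.Column;
}

using Match = std::vector<std::size_t>; // one index into Lexed per indexed entry

// Enumerates every order-preserving mapping of Indexed[NextIndexed..] onto
// Lexed[NextLexed..]. Fuel is passed by reference so that it bounds the total
// number of calls (breadth as well as depth).
void enumerateNearMisses(const std::vector<Pos> &Indexed,
                         const std::vector<Pos> &Lexed,
                         std::size_t NextIndexed, std::size_t NextLexed,
                         Match &Partial, int &Fuel,
                         const std::function<void(const Match &)> &OnMatch) {
  assert(NextIndexed <= Indexed.size() && NextLexed <= Lexed.size());
  if (--Fuel < 0)
    return;
  // Bail out once too few lexed occurrences remain to match the rest.
  if (Indexed.size() - NextIndexed > Lexed.size() - NextLexed)
    return;
  if (NextIndexed == Indexed.size()) {
    OnMatch(Partial);
    return;
  }
  if (impliesSimpleEdit(Indexed[NextIndexed], Lexed[NextLexed])) {
    // Match: consume both the indexed and the lexed occurrence.
    Partial.push_back(NextLexed);
    enumerateNearMisses(Indexed, Lexed, NextIndexed + 1, NextLexed + 1,
                        Partial, Fuel, OnMatch);
    Partial.pop_back();
  }
  // Don't match: skip this lexed occurrence.
  enumerateNearMisses(Indexed, Lexed, NextIndexed, NextLexed + 1, Partial,
                      Fuel, OnMatch);
}

// Convenience wrapper that collects all matches found within the fuel budget.
std::vector<Match> allNearMisses(const std::vector<Pos> &Indexed,
                                 const std::vector<Pos> &Lexed, int Fuel) {
  std::vector<Match> Matches;
  Match Partial;
  enumerateNearMisses(Indexed, Lexed, 0, 0, Partial, Fuel,
                      [&](const Match &M) { Matches.push_back(M); });
  return Matches;
}
```

Only index vectors are produced here, so a caller can score each candidate and copy the actual ranges for the winner alone.
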
clang-tools-extra/clangd/refactor/Rename.h
59	I'm not sure if authoritative is accurate here - authoritative would be from an AST or fresh index. I'd suggest "Adjusts indexed occurrences to match the current state of the file"
61	nit: The index is not always up to date.
62	"bad rename result" is vague. Maybe "Blindly editing at the locations reported by the index may mangle the code in such cases".
62	"This API helps" doesn't add much here - maybe merge with the next sentence? "This function determines whether the indexed occurrences can be applied to this file, and heuristically..."
66	The details comment belongs in the implementation rather than the header, I think.
75	"we use the best near miss" - it would help to describe approximately what this means. It's very hard to understand that this can fail because no candidate is "near". e.g. "we attempt to map the indexed locations onto candidates in a plausible way (e.g. guess that lines were inserted). If such a "near miss" is found, the rename is still possible"
77	nit: occcurrences -> occurrences; nit: break before (
79	I think this use of "patch" is a little confusing, and the name is a bit vague for the clangd namespace. `adjustRenameRanges`?
84	"the mapping result" doesn't make sense in this context (the header), as you haven't exposed/described the function that computes mappings or otherwise defined the concept. (I think exposing the function would be best both for clarity and for unit-tests)
86	I'd suggest moving most of this comment (at least the examples) to the cpp file.
103	`editCost` is again not a specific enough name for this operation. The cost we're describing is specific to adjusting the indexed ranges to correspond to the mapped ones. Maybe `renameRangeAdjustmentCost()`?

address review comments.

Build result: pass - 60605 tests passed, 0 failed and 726 were skipped.

Log files: console-log.txt, CMakeCache.txt

Harbormaster completed remote builds in B42073: Diff 232743. Dec 8 2019, 1:02 PM

some minor fixes.

clang-tools-extra/clangd/refactor/Rename.cpp
452	we could use the index array to evaluate the scores, but it would make the cost API signature a bit weird, like `size_t renameRangeAdjustmentCost(ArrayRef<Range> Indexed, ArrayRef<Range> Lexed, ArrayRef<size_t> MatchedLexedIndex);`
clang-tools-extra/clangd/refactor/Rename.h
96	I got the point here, exposed `getMappedRanges` and added unit tests for it.

Build result: pass - 60605 tests passed, 0 failed and 726 were skipped.

Log files: console-log.txt, CMakeCache.txt

Harbormaster completed remote builds in B42074: Diff 232744. Dec 8 2019, 1:21 PM

sammccall accepted this revision. Dec 9 2019, 4:53 AM

sammccall added inline comments.

clang-tools-extra/clangd/refactor/Rename.cpp
359	nit: these are not candidates, they are `RenameRanges` or similar
364	This is worth a comment (because the error message returned describes a condition that we usually recover from)
452	It's not really an API, right? Just a helper function exposed for testing. I don't think this is a problem.
569	why return Expected rather than Optional here? What do we want to do with the message?
589	This message isn't meaningful outside this TU - it should be a vlog, or easier to understand
593	return Indexed.vec()
618	distinct -> unique
622	found -> find
622	(again, these error messages are not useful to the user, as-is)
clang-tools-extra/clangd/refactor/Rename.h
82	also exposed for testing only?
clang-tools-extra/clangd/unittests/RenameTests.cpp
866	can you make the new line non-empty (add a comment) and change int->double instead of adding whitespace? Was hard to see what's going on here

This revision is now accepted and ready to land. Dec 9 2019, 4:53 AM

address comments:

don't emit the internal messages to users, llvm::Expected => llvm::Optional
use the index of lexed array to calculate the adjustment cost.

hokein added inline comments. Dec 9 2019, 7:01 AM

clang-tools-extra/clangd/refactor/Rename.cpp
569	I found these error messages are useful for debugging. But you are right, end users are not interested in them, changed to vlog and Optional here.

Build result: pass - 60627 tests passed, 0 failed and 726 were skipped.

Log files: console-log.txt, CMakeCache.txt

Harbormaster completed remote builds in B42129: Diff 232844. Dec 9 2019, 7:19 AM

Closed by commit rG891f82222bb8: [clangd] Implement range patching heuristics for cross-file rename. (authored by hokein). Dec 9 2019, 8:06 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
clang-tools-extra/clangd/refactor/Rename.h	27 lines
clang-tools-extra/clangd/refactor/Rename.cpp	161 lines
clang-tools-extra/clangd/unittests/RenameTests.cpp	301 lines

Diff 232744

clang-tools-extra/clangd/refactor/Rename.h

	//===--- Rename.h - Symbol-rename refactorings -------------------- C++--===//			//===--- Rename.h - Symbol-rename refactorings -------------------- C++--===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_REFACTOR_RENAME_H			#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_REFACTOR_RENAME_H
	#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_REFACTOR_RENAME_H			#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_REFACTOR_RENAME_H

	#include "Path.h"			#include "Path.h"
	#include "Protocol.h"			#include "Protocol.h"
	#include "SourceCode.h"			#include "SourceCode.h"
				#include "clang/Basic/LangOptions.h"
	#include "clang/Tooling/Core/Replacement.h"			#include "clang/Tooling/Core/Replacement.h"
	#include "llvm/Support/Error.h"			#include "llvm/Support/Error.h"

	namespace clang {			namespace clang {
	namespace clangd {			namespace clangd {
	class ParsedAST;			class ParsedAST;
	class SymbolIndex;			class SymbolIndex;

	(27 lines not shown)
	/// Generates rename edits that replaces all given occurrences with the			/// Generates rename edits that replaces all given occurrences with the
	/// NewName.			/// NewName.
	/// Exposed for testing only.			/// Exposed for testing only.
	llvm::Expected<Edit> buildRenameEdit(llvm::StringRef AbsFilePath,			llvm::Expected<Edit> buildRenameEdit(llvm::StringRef AbsFilePath,
	llvm::StringRef InitialCode,			llvm::StringRef InitialCode,
	std::vector<Range> Occurrences,			std::vector<Range> Occurrences,
	llvm::StringRef NewName);			llvm::StringRef NewName);

				/// Adjusts indexed occurrences to match the current state of the file.
				///
				/// The Index is not always up to date. Blindly editing at the locations
				/// reported by the index may mangle the code in such cases.
				/// This function determines whether the indexed occurrences can be applied to
				/// this file, and heuristically repairs the occurrences if necessary.
				///
				/// The API assumes that Indexed contains only named occurrences (each
				/// occurrence has the same length).
				llvm::Expected<std::vector<Range>>
				adjustRenameRanges(llvm::StringRef DraftCode, llvm::StringRef Identifier,
				std::vector<Range> Indexed, const LangOptions &LangOpts);

				/// Calculates the lexed occurrences that the given indexed occurrences map to.
				/// Returns an error if we don't find a mapping.
				///
				/// Exposed for testing only.
				///
				/// REQUIRED: Indexed and Lexed are sorted.
				llvm::Expected<std::vector<Range>> getMappedRanges(ArrayRef<Range> Indexed,
				ArrayRef<Range> Lexed);
				/// Evaluates how good the mapped result is. 0 indicates a perfect match.
				/// REQUIRED: Indexed and Mapped are sorted, and have the same size.
				size_t renameRangeAdjustmentCost(ArrayRef<Range> Indexed,
				ArrayRef<Range> Mapped);

	} // namespace clangd			} // namespace clangd
	} // namespace clang			} // namespace clang

	#endif // LLVM_CLANG_TOOLS_EXTRA_CLANGD_REFACTOR_RENAME_H			#endif // LLVM_CLANG_TOOLS_EXTRA_CLANGD_REFACTOR_RENAME_H

clang-tools-extra/clangd/refactor/Rename.cpp

(15 lines not shown)
#include "index/SymbolCollector.h"		#include "index/SymbolCollector.h"
#include "clang/AST/DeclCXX.h"		#include "clang/AST/DeclCXX.h"
#include "clang/AST/DeclTemplate.h"		#include "clang/AST/DeclTemplate.h"
#include "clang/Basic/SourceLocation.h"		#include "clang/Basic/SourceLocation.h"
#include "clang/Tooling/Refactoring/Rename/USRFindingAction.h"		#include "clang/Tooling/Refactoring/Rename/USRFindingAction.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
#include "llvm/Support/FormatVariadic.h"		#include "llvm/Support/FormatVariadic.h"
		#include <algorithm>

namespace clang {		namespace clang {
namespace clangd {		namespace clangd {
namespace {		namespace {

llvm::Optional<std::string> filePath(const SymbolLocation &Loc,		llvm::Optional<std::string> filePath(const SymbolLocation &Loc,
llvm::StringRef HintFilePath) {		llvm::StringRef HintFilePath) {
if (!Loc)		if (!Loc)
(318 lines not shown)	llvm::Expected<FileEdits> renameOutsideFile(
for (auto &FileAndOccurrences : *AffectedFiles) {		for (auto &FileAndOccurrences : *AffectedFiles) {
llvm::StringRef FilePath = FileAndOccurrences.first();		llvm::StringRef FilePath = FileAndOccurrences.first();

auto AffectedFileCode = GetFileContent(FilePath);		auto AffectedFileCode = GetFileContent(FilePath);
if (!AffectedFileCode) {		if (!AffectedFileCode) {
elog("Fail to read file content: {0}", AffectedFileCode.takeError());		elog("Fail to read file content: {0}", AffectedFileCode.takeError());
continue;		continue;
}		}
auto RenameEdit =		auto RenameCandidates =
buildRenameEdit(FilePath, *AffectedFileCode,		adjustRenameRanges(*AffectedFileCode, RenameDecl.getNameAsString(),
std::move(FileAndOccurrences.second), NewName);		std::move(FileAndOccurrences.second),
		RenameDecl.getASTContext().getLangOpts());
		if (!RenameCandidates) {
		return llvm::make_error<llvm::StringError>(
		llvm::formatv("Index results don't match the content of file {0} "
		"(the index may be stale), details: {0}",
		FilePath, llvm::toString(RenameCandidates.takeError())),
		llvm::inconvertibleErrorCode());
		}
		auto RenameEdit = buildRenameEdit(FilePath, *AffectedFileCode,
		*RenameCandidates, NewName);
if (!RenameEdit) {		if (!RenameEdit) {
return llvm::make_error<llvm::StringError>(		return llvm::make_error<llvm::StringError>(
llvm::formatv("fail to build rename edit for file {0}: {1}", FilePath,		llvm::formatv("fail to build rename edit for file {0}: {1}", FilePath,
llvm::toString(RenameEdit.takeError())),		llvm::toString(RenameEdit.takeError())),
llvm::inconvertibleErrorCode());		llvm::inconvertibleErrorCode());
}		}
if (!RenameEdit->Replacements.empty())		if (!RenameEdit->Replacements.empty())
Results.insert({FilePath, std::move(*RenameEdit)});		Results.insert({FilePath, std::move(*RenameEdit)});
}		}
return Results;		return Results;
}		}

		// A simple edit is either changing line or column, but not both.
		bool impliesSimpleEdit(const Position &LHS, const Position &RHS) {
		return LHS.line == RHS.line || LHS.character == RHS.character;
		}

		// Performs a DFS to enumerate all possible near-miss matches.
		// It finds the locations where the indexed occurrences are now spelled in
		// Lexed occurrences, a near miss is defined as:
		// - a near miss maps all of the name occurrences from the index onto a
		// subset of lexed occurrences (we allow a single name to refer to more
		// than one symbol)
		// - all indexed occurrences must be mapped, and Result must be distinct and
		// preserve order (only support detecting simple edits to ensure a
		// robust mapping)
		// - each indexed -> lexed occurrences mapping correspondence may change the
		// line or column, but not both (increases chance of a robust mapping)
		void findNearMiss(
		std::vector<int> &PartialMatch, ArrayRef<Range> IndexedRest,
		ArrayRef<Range> LexedRest, int LexedIndex, int &Fuel,
		llvm::function_ref<void(const std::vector<int> &)> MatchedCB) {
		if (--Fuel < 0)
		return;
		if (IndexedRest.size() > LexedRest.size())
		return;
		if (IndexedRest.empty()) {
		MatchedCB(PartialMatch);
		return;
		}
		if (impliesSimpleEdit(IndexedRest.front().start, LexedRest.front().start)) {
		PartialMatch.push_back(LexedIndex);
		findNearMiss(PartialMatch, IndexedRest.drop_front(), LexedRest.drop_front(),
		LexedIndex + 1, Fuel, MatchedCB);
		PartialMatch.pop_back();
		}
		findNearMiss(PartialMatch, IndexedRest, LexedRest.drop_front(),
		LexedIndex + 1, Fuel, MatchedCB);
		}

} // namespace		} // namespace

llvm::Expected<FileEdits> rename(const RenameInputs &RInputs) {		llvm::Expected<FileEdits> rename(const RenameInputs &RInputs) {
ParsedAST &AST = RInputs.AST;		ParsedAST &AST = RInputs.AST;
const SourceManager &SM = AST.getSourceManager();		const SourceManager &SM = AST.getSourceManager();
llvm::StringRef MainFileCode = SM.getBufferData(SM.getMainFileID());		llvm::StringRef MainFileCode = SM.getBufferData(SM.getMainFileID());
auto GetFileContent = [&RInputs,		auto GetFileContent = [&RInputs,
&SM](PathRef AbsPath) -> llvm::Expected<std::string> {		&SM](PathRef AbsPath) -> llvm::Expected<std::string> {
llvm::Optional<std::string> DirtyBuffer;		llvm::Optional<std::string> DirtyBuffer;
if (RInputs.GetDirtyBuffer &&		if (RInputs.GetDirtyBuffer &&
(DirtyBuffer = RInputs.GetDirtyBuffer(AbsPath)))		(DirtyBuffer = RInputs.GetDirtyBuffer(AbsPath)))
return std::move(*DirtyBuffer);		return std::move(*DirtyBuffer);

auto Content =		auto Content =
SM.getFileManager().getVirtualFileSystem().getBufferForFile(AbsPath);		SM.getFileManager().getVirtualFileSystem().getBufferForFile(AbsPath);
if (!Content)		if (!Content)
return llvm::createStringError(		return llvm::createStringError(
llvm::inconvertibleErrorCode(),		llvm::inconvertibleErrorCode(),
llvm::formatv("Fail to open file {0}: {1}", AbsPath,		llvm::formatv("Fail to open file {0}: {1}", AbsPath,
Content.getError().message()));		Content.getError().message()));
if (!*Content)		if (!*Content)
return llvm::createStringError(		return llvm::createStringError(
llvm::inconvertibleErrorCode(),		llvm::inconvertibleErrorCode(),
llvm::formatv("Got no buffer for file {0}", AbsPath));		llvm::formatv("Got no buffer for file {0}", AbsPath));

return (*Content)->getBuffer().str();		return (*Content)->getBuffer().str();
};		};
SourceLocation SourceLocationBeg = SM.getMacroArgExpandedLocation(		SourceLocation SourceLocationBeg = SM.getMacroArgExpandedLocation(
getBeginningOfIdentifier(RInputs.Pos, SM, AST.getLangOpts()));		getBeginningOfIdentifier(RInputs.Pos, SM, AST.getLangOpts()));
// FIXME: Renaming macros is not supported yet, the macro-handling code should		// FIXME: Renaming macros is not supported yet, the macro-handling code should
// be moved to rename tooling library.		// be moved to rename tooling library.
if (locateMacroAt(SourceLocationBeg, AST.getPreprocessor()))		if (locateMacroAt(SourceLocationBeg, AST.getPreprocessor()))
return makeError(ReasonToReject::UnsupportedSymbol);		return makeError(ReasonToReject::UnsupportedSymbol);

auto DeclsUnderCursor = locateDeclAt(AST, SourceLocationBeg);		auto DeclsUnderCursor = locateDeclAt(AST, SourceLocationBeg);
if (DeclsUnderCursor.empty())		if (DeclsUnderCursor.empty())
return makeError(ReasonToReject::NoSymbolFound);		return makeError(ReasonToReject::NoSymbolFound);
if (DeclsUnderCursor.size() > 1)		if (DeclsUnderCursor.size() > 1)
return makeError(ReasonToReject::AmbiguousSymbol);		return makeError(ReasonToReject::AmbiguousSymbol);

const auto RenameDecl = llvm::dyn_cast<NamedDecl>(DeclsUnderCursor.begin());		const auto RenameDecl = llvm::dyn_cast<NamedDecl>(DeclsUnderCursor.begin());
if (!RenameDecl)		if (!RenameDecl)
return makeError(ReasonToReject::UnsupportedSymbol);		return makeError(ReasonToReject::UnsupportedSymbol);

[82 lines not shown]
		for (const auto &R : OccurrencesOffsets) {
auto ByteLength = R.second - R.first;		auto ByteLength = R.second - R.first;
if (auto Err = RenameEdit.add(		if (auto Err = RenameEdit.add(
tooling::Replacement(AbsFilePath, R.first, ByteLength, NewName)))		tooling::Replacement(AbsFilePath, R.first, ByteLength, NewName)))
return std::move(Err);		return std::move(Err);
}		}
return Edit(InitialCode, std::move(RenameEdit));		return Edit(InitialCode, std::move(RenameEdit));
}		}

		// Details:
		// - lex the draft code to get all rename candidates, this yields a superset
		// of candidates.
		// - apply range patching heuristics to generate "authoritative" occurrences,
		// cases we consider:
		// (a) index returns a subset of candidates, we use the indexed results.
		// - fully equal, we are sure the index is up-to-date
		// - proper subset, index is correct in most cases? there may be false
		// positives (e.g. candidates got appended), but rename is still safe
		// (b) index returns non-candidate results, we attempt to map the indexed
		// ranges onto candidates in a plausible way (e.g. guess that lines
		// were inserted). If such a "near miss" is found, the rename is still
		sammccall (Not Done): That doesn't mean you need std::move to get a move, it just means you can't avoid calling the move constructor. https://godbolt.org/z/Rh-CvT
		// possible
		llvm::Expected<std::vector<Range>>
		sammccall (Done): Why return Expected rather than Optional here? What do we want to do with the message?
		hokein (author, Done): I found these error messages useful for debugging. But you are right, end users are not interested in them; changed to vlog and Optional here.
		adjustRenameRanges(llvm::StringRef DraftCode, llvm::StringRef Identifier,
		std::vector<Range> Indexed, const LangOptions &LangOpts) {
		assert(!Indexed.empty());
		std::vector<Range> Lexed =
		collectIdentifierRanges(Identifier, DraftCode, LangOpts);
		sammccall (Done): Not clear to me why we're building these structures. The cost is a sum of implied edits, implied edits are a function of (last displacement, current displacement, are they on the same line) so what about something like:
		  Cost = 0;
		  LastLine = -1;
		  LastDX = 0, LastDY = 0;
		  for (I) {
		    DX = Mapped[I].begin.character - Indexed[I].begin.character;
		    DY = Mapped[I].begin.line - Indexed[I].begin.line;
		    Line = Mapped[I].begin.line;
		    if (!(Line == LastLine && DX == LastDX))
		      LastDX = 0; // horizontal offsets don't carry across lines
		    Cost += abs(DX - LastDX) + abs(DY - LastDY);
		    LastDX, LastDY, LastLine = DX, DY, Line;
		  }
		llvm::sort(Indexed);
		llvm::sort(Lexed);
		return getMappedRanges(Indexed, Lexed);
		}

		llvm::Expected<std::vector<Range>> getMappedRanges(ArrayRef<Range> Indexed,
		ArrayRef<Range> Lexed) {
		assert(!Indexed.empty());
		assert(std::is_sorted(Indexed.begin(), Indexed.end()));
		assert(std::is_sorted(Lexed.begin(), Lexed.end()));

		if (Indexed.size() > Lexed.size())
		return llvm::make_error<llvm::StringError>(
		llvm::inconvertibleErrorCode(),
		llvm::formatv("The number of lexed occurrences is less than indexed "
		sammccall (Done): This message isn't meaningful outside this TU; it should be a vlog, or easier to understand.
		"occurrences"));
		// Fast check for the special subset case.
		if (std::includes(Lexed.begin(), Lexed.end(), Indexed.begin(), Indexed.end()))
		return std::vector<Range>{Indexed.begin(), Indexed.end()};
		sammccall (Done): return Indexed.vec()

		std::vector<Range> Best;
		size_t BestCost = std::numeric_limits<size_t>::max();
		bool HasMultiple = 0;
		std::vector<int> MatchedLexedIndex;
		int Fuel = 10000;
		findNearMiss(MatchedLexedIndex, Indexed, Lexed, 0, Fuel,
		[&](const std::vector<int> &Matched) {
		std::vector<Range> Mapped;
		for (int I : MatchedLexedIndex)
		Mapped.push_back(Lexed[I]);
		size_t MCost = renameRangeAdjustmentCost(Indexed, Mapped);
		if (MCost < BestCost) {
		BestCost = MCost;
		Best = std::move(Mapped);
		HasMultiple = false; // reset
		return;
		}
		if (MCost == BestCost)
		HasMultiple = true;
		});
		if (HasMultiple)
		return llvm::make_error<llvm::StringError>(
		llvm::inconvertibleErrorCode(),
		llvm::formatv("The best near miss is not distinct"));
		sammccall (Done): distinct -> unique
		if (Best.empty())
		return llvm::make_error<llvm::StringError>(
		llvm::inconvertibleErrorCode(),
		llvm::formatv("Didn't found a near miss"));
		sammccall (Done): found -> find
		sammccall (Done): (again, these error messages are not useful to the user, as-is)
		return Best;
		}
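
Note on the fast subset check: `std::includes` takes the candidate superset as its first range and the candidate subset as its second, and both ranges must already be sorted. The sketch below illustrates that argument order with plain integers standing in for `Range`; the helper name `isSortedSubset` is hypothetical and not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not from the patch): returns true if every element of
// Subset also occurs in Superset. Both inputs must be sorted ascending,
// which is the precondition of std::includes.
bool isSortedSubset(const std::vector<int> &Superset,
                    const std::vector<int> &Subset) {
  assert(std::is_sorted(Superset.begin(), Superset.end()));
  assert(std::is_sorted(Subset.begin(), Subset.end()));
  // First range: the superset. Second range: the subset being tested.
  return std::includes(Superset.begin(), Superset.end(),
                       Subset.begin(), Subset.end());
}
```

In the rename setting, the lexed candidates would play the role of `Superset` and the indexed occurrences the role of `Subset`.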

		// The cost is the sum of the implied edit sizes between successive diffs, only
		// simple edits are considered:
		// - insert/remove a line (change line offset)
		// - insert/remove a character on an existing line (change column offset)
		//
		// Example I, total result is 1 + 1 = 2.
		// diff[0]: line + 1 <- insert a line before edit 0.
		// diff[1]: line + 1
		// diff[2]: line + 1
		// diff[3]: line + 2 <- insert a line between edits 2 and 3.
		//
		// Example II, total result is 1 + 1 + 1 = 3.
		// diff[0]: line + 1 <- insert a line before edit 0.
		// diff[1]: column + 1 <- remove a line between edits 0 and 1, and insert a
		// character on edit 1.
		size_t renameRangeAdjustmentCost(ArrayRef<Range> Indexed,
		ArrayRef<Range> Mapped) {
		assert(Indexed.size() == Mapped.size());
		assert(std::is_sorted(Indexed.begin(), Indexed.end()));
		assert(std::is_sorted(Mapped.begin(), Mapped.end()));

		int LastLine = -1;
		int LastDLine = 0, LastDColumn = 0;
		int Cost = 0;
		for (size_t I = 0; I < Indexed.size(); ++I) {
		int DLine = Indexed[I].start.line - Mapped[I].start.line;
		int DColumn = Indexed[I].start.character - Mapped[I].start.character;
		int Line = Indexed[I].start.line;
		if (Line != LastLine)
		LastDColumn = 0; // column offsets don't carry across lines.
		Cost += abs(DLine - LastDLine) + abs(DColumn - LastDColumn);
		std::tie(LastLine, LastDLine, LastDColumn) = std::tie(Line, DLine, DColumn);
		}
		return Cost;
		}

} // namespace clangd		} // namespace clangd
} // namespace clang		} // namespace clang
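
For readers who want to try the cost metric outside clangd, here is a minimal standalone sketch of the same idea: the cost is the sum of changes in line and column displacement between successive occurrences, with the column displacement reset whenever the indexed line changes. The `Pos` struct and the name `adjustmentCost` are assumptions made for illustration; the patch itself operates on clangd's `Range` type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the start position of a clangd Range.
struct Pos {
  int Line;
  int Column;
};

// Sketch of the implied-edit cost. Indexed[I] is assumed to be mapped onto
// Mapped[I]; both vectors must have the same length.
std::size_t adjustmentCost(const std::vector<Pos> &Indexed,
                           const std::vector<Pos> &Mapped) {
  assert(Indexed.size() == Mapped.size());
  int LastLine = -1, LastDLine = 0, LastDColumn = 0;
  std::size_t Cost = 0;
  for (std::size_t I = 0; I < Indexed.size(); ++I) {
    int DLine = Mapped[I].Line - Indexed[I].Line;
    int DColumn = Mapped[I].Column - Indexed[I].Column;
    int Line = Indexed[I].Line;
    if (Line != LastLine)
      LastDColumn = 0; // Column offsets do not carry across lines.
    // Each change in displacement implies an edit of that size.
    Cost += static_cast<std::size_t>(std::abs(DLine - LastDLine) +
                                     std::abs(DColumn - LastDColumn));
    LastLine = Line;
    LastDLine = DLine;
    LastDColumn = DColumn;
  }
  return Cost;
}
```

With line displacements of +1, +1, +1, +2 (Example I in the comment above the function), the sketch yields 1 + 1 = 2.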

clang-tools-extra/clangd/unittests/RenameTests.cpp

//===-- RenameTests.cpp ------------------------------------------ C++ --===//		//===-- RenameTests.cpp ------------------------------------------ C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "Annotations.h"		#include "Annotations.h"
#include "ClangdServer.h"		#include "ClangdServer.h"
#include "SyncAPI.h"		#include "SyncAPI.h"
#include "TestFS.h"		#include "TestFS.h"
#include "TestTU.h"		#include "TestTU.h"
#include "index/Ref.h"		#include "index/Ref.h"
#include "refactor/Rename.h"		#include "refactor/Rename.h"
#include "clang/Tooling/Core/Replacement.h"		#include "clang/Tooling/Core/Replacement.h"
		#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"
#include "gmock/gmock.h"		#include "gmock/gmock.h"
#include "gtest/gtest.h"		#include "gtest/gtest.h"

namespace clang {		namespace clang {
namespace clangd {		namespace clangd {
namespace {		namespace {

[823 lines not shown]
		[[range]]
[[range]]		[[range]]
)cpp");		)cpp");
Edit = buildRenameEdit(FilePath, T.code(), T.ranges(), "abc");		Edit = buildRenameEdit(FilePath, T.code(), T.ranges(), "abc");
ASSERT_TRUE(bool(Edit)) << Edit.takeError();		ASSERT_TRUE(bool(Edit)) << Edit.takeError();
EXPECT_EQ(applyEdits(FileEdits{{T.code(), std::move(*Edit)}}).front().second,		EXPECT_EQ(applyEdits(FileEdits{{T.code(), std::move(*Edit)}}).front().second,
expectedResult(Code, expectedResult(T, "abc")));		expectedResult(Code, expectedResult(T, "abc")));
}		}

		TEST(CrossFileRenameTests, adjustRenameRanges) {
		// Ranges in IndexedCode indicate the indexed occurrences;
		// ranges in DraftCode indicate the expected mapped result, empty indicates
		// we expect no matched result found.
		struct {
		llvm::StringRef IndexedCode;
		llvm::StringRef DraftCode;
		} Tests[] = {
		{
		// both line and column are changed, not a near miss.
		sammccall (Done): Can you make the new line non-empty (add a comment) and change int->double instead of adding whitespace? It was hard to see what's going on here.
		R"cpp(
		int [[x]] = 0;
		)cpp",
		R"cpp(

		int x = 0;
		)cpp",
		},
		{
		// subset.
		R"cpp(
		int [[x]] = 0;
		)cpp",
		R"cpp(
		int [[x]] = 0;
		{int x = 0; }
		)cpp",
		},
		{
		// shift columns.
		R"cpp(int [[x]] = 0; void foo(int x);)cpp",
		R"cpp(double [[x]] = 0; void foo(double x);)cpp",
		},
		{
		// insert a line.
		R"cpp(
		int [[x]] = 0;
		void foo(int x);
		)cpp",
		R"cpp(
		//
		int [[x]] = 0;
		void foo(int x);
		)cpp",
		},
		};
		LangOptions LangOpts;
		LangOpts.CPlusPlus = true;
		for (const auto &T : Tests) {
		Annotations Draft(T.DraftCode);
		auto ActualRanges = adjustRenameRanges(
		Draft.code(), "x", Annotations(T.IndexedCode).ranges(), LangOpts);
		if (!Draft.ranges().empty()) {
		EXPECT_TRUE((bool)ActualRanges) << "adjustRenameRanges returned an error: "
		<< llvm::toString(ActualRanges.takeError());
		EXPECT_THAT(Draft.ranges(),
		testing::UnorderedElementsAreArray(*ActualRanges))
		<< T.DraftCode;
		} else {
		EXPECT_FALSE(ActualRanges)
		<< "expected an error: " << T.DraftCode;
		llvm::consumeError(ActualRanges.takeError());
		}
		}
		}

		TEST(RangePatchingHeuristic, GetMappedRanges) {
		// ^ in LexedCode marks the ranges we expect to be mapped; no ^ indicates
		// there are no mapped ranges.
		struct {
		llvm::StringRef IndexedCode;
		llvm::StringRef LexedCode;
		} Tests[] = {
		{
		// no lexed ranges.
		"[[]]",
		"",
		},
		{
		// both line and column are changed, not a near miss.
		R"([[]])",
		R"(
		[[]]
		)",
		},
		{
		// subset.
		"[[]]",
		"^[[]] [[]]"
		},
		{
		// shift columns.
		"[[]] [[]]",
		" ^[[]] ^[[]] [[]]"
		},
		{
		R"(
		[[]]

		[[]] [[]]
		)",
		R"(
		// insert a line
		^[[]]

		^[[]] ^[[]]
		)",
		},
		{
		R"(
		[[]]

		[[]] [[]]
		)",
		R"(
		// insert a line
		^[[]]
		^[[]] ^[[]] // column is shifted.
		)",
		},
		{
		R"(
		[[]]

		[[]] [[]]
		)",
		R"(
		// insert a line
		[[]]

		[[]] [[]] // not mapped (both line and column are changed).
		)",
		},
		{
		R"(
		[[]]
		[[]]

		[[]]
		[[]]

		}
		)",
		R"(
		// insert a new line
		^[[]]
		^[[]]
		[[]] // additional range
		^[[]]
		^[[]]
		[[]] // additional range
		)",
		},
		{
		// non-distinct result (two best results), not a near miss
		R"(
		[[]]
		[[]]
		[[]]
		)",
		R"(
		[[]]
		[[]]
		[[]]
		[[]]
		)",
		}
		};
		for (const auto &T : Tests) {
		auto Lexed = Annotations(T.LexedCode);
		auto LexedRanges = Lexed.ranges();
		std::vector<Range> ExpectedMatches;
		for (auto P : Lexed.points()) {
		auto Match = llvm::find_if(LexedRanges, [&P](const Range& R) {
		return R.start == P;
		});
		ASSERT_NE(Match, LexedRanges.end());
		ExpectedMatches.push_back(*Match);
		}

		auto Mapped =
		getMappedRanges(Annotations(T.IndexedCode).ranges(), LexedRanges);
		if (!ExpectedMatches.empty()) {
		EXPECT_TRUE((bool)Mapped) << "getMappedRanges returned an error: "
		<< llvm::toString(Mapped.takeError());
		EXPECT_THAT(ExpectedMatches, testing::UnorderedElementsAreArray(*Mapped))
		<< T.IndexedCode;
		} else {
		EXPECT_FALSE(Mapped) << "expected an error: " << T.IndexedCode;
		llvm::consumeError(Mapped.takeError());
		}
		}
		}

		TEST(CrossFileRenameTests, adjustmentCost) {
		struct {
		llvm::StringRef RangeCode;
		size_t ExpectedCost;
		} Tests[] = {
		{
		R"(
		$idx[[]]$lex[[]] // diff: 0
		)",
		0,
		},
		{
		R"(
		$idx[[]]
		$lex[[]] // line diff: +1
		$idx[[]]
		$lex[[]] // line diff: +1
		$idx[[]]
		$lex[[]] // line diff: +1

		$idx[[]]

		$lex[[]] // line diff: +2
		)",
		1 + 1
		},
		{
		R"(
		$idx[[]]
		$lex[[]] // line diff: +1
		$idx[[]]

		$lex[[]] // line diff: +2
		$idx[[]]


		$lex[[]] // line diff: +3
		)",
		1 + 1 + 1
		},
		{
		R"(
		$idx[[]]


		$lex[[]] // line diff: +3
		$idx[[]]

		$lex[[]] // line diff: +2
		$idx[[]]
		$lex[[]] // line diff: +1
		)",
		3 + 1 + 1
		},
		{
		R"(
		$idx[[]]
		$lex[[]] // line diff: +1
		$lex[[]] // line diff: -2

		$idx[[]]
		$idx[[]]


		$lex[[]] // line diff: +3
		)",
		1 + 3 + 5
		},
		{
		R"(
		$idx[[]] $lex[[]] // column diff: +1
		$idx[[]]$lex[[]] // diff: 0
		)",
		1
		},
		{
		R"(
		$idx[[]]
		$lex[[]] // diff: +1
		$idx[[]] $lex[[]] // column diff: +1
		$idx[[]]$lex[[]] // diff: 0
		)",
		1 + 1 + 1
		},
		{
		R"(
		$idx[[]] $lex[[]] // column diff: +1
		)",
		1
		},
		{
		R"(
		// column diffs: +1, +2, +3
		$idx[[]] $lex[[]] $idx[[]] $lex[[]] $idx[[]] $lex[[]]
		)",
		1 + 1 + 1,
		},
		};
		for (const auto &T : Tests) {
		Annotations C(T.RangeCode);
		EXPECT_EQ(renameRangeAdjustmentCost(C.ranges("idx"), C.ranges("lex")),
		T.ExpectedCost)
		<< T.RangeCode;
		}
		}

} // namespace		} // namespace
} // namespace clangd		} // namespace clangd
} // namespace clang		} // namespace clang
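
The body of `findNearMiss` is not visible in this view of the diff, so the following is only a sketch of how the enumeration exercised by the GetMappedRanges test could look, assuming the recursive match / don't-match shape suggested in the review and the rule that a correspondence may change the line or the column but not both. `Pos`, `enumerateNearMisses` and `allNearMisses` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the start position of a clangd Range.
struct Pos {
  int Line;
  int Column;
};

// A pairing is plausible if the line or the column is unchanged.
bool isLineOrColumnEqual(const Pos &A, const Pos &B) {
  return A.Line == B.Line || A.Column == B.Column;
}

// Enumerates every order-preserving mapping of Indexed onto a subset of
// Lexed. Matched[I] is the Lexed index chosen for Indexed[I]. Fuel bounds
// the number of recursive calls.
void enumerateNearMisses(
    const std::vector<Pos> &Indexed, const std::vector<Pos> &Lexed,
    std::vector<std::size_t> &Matched, std::size_t NextLexed, int &Fuel,
    const std::function<void(const std::vector<std::size_t> &)> &OnMatch) {
  if (--Fuel < 0)
    return;
  if (Matched.size() == Indexed.size()) {
    OnMatch(Matched); // Every indexed occurrence has been mapped.
    return;
  }
  // Prune: not enough lexed occurrences left to map the remaining ones.
  if (Lexed.size() - NextLexed < Indexed.size() - Matched.size())
    return;
  if (isLineOrColumnEqual(Indexed[Matched.size()], Lexed[NextLexed])) {
    // Match: pair the next indexed occurrence with Lexed[NextLexed].
    Matched.push_back(NextLexed);
    enumerateNearMisses(Indexed, Lexed, Matched, NextLexed + 1, Fuel, OnMatch);
    Matched.pop_back();
  }
  // Don't match: skip Lexed[NextLexed].
  enumerateNearMisses(Indexed, Lexed, Matched, NextLexed + 1, Fuel, OnMatch);
}

// Convenience wrapper that collects all candidate mappings.
std::vector<std::vector<std::size_t>>
allNearMisses(const std::vector<Pos> &Indexed, const std::vector<Pos> &Lexed) {
  std::vector<std::vector<std::size_t>> Result;
  std::vector<std::size_t> Matched;
  int Fuel = 10000;
  enumerateNearMisses(
      Indexed, Lexed, Matched, 0, Fuel,
      [&](const std::vector<std::size_t> &M) { Result.push_back(M); });
  return Result;
}
```

A caller would then score each candidate with the cost function and reject the result when the lowest cost is shared by more than one mapping, as the last GetMappedRanges case expects.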

[clangd] Implement range patching heuristics for cross-file rename. (Closed, Public)

Diff 232744

clang-tools-extra/clangd/refactor/Rename.h

clang-tools-extra/clangd/refactor/Rename.cpp

clang-tools-extra/clangd/unittests/RenameTests.cpp