This is an archive of the discontinued LLVM Phabricator instance.

Fix overlapping replacements in clang-tidy.
ClosedPublic

Authored by angelgarcia on Oct 7 2015, 10:02 AM.

Details

Reviewers

klimek
bkramer

Commits

rG166935764b11: Fix overlapping replacements in clang-tidy.
rCTE250509: Fix overlapping replacements in clang-tidy.
rL250509: Fix overlapping replacements in clang-tidy.

Summary

Prevent clang-tidy from applying fixes to errors that overlap with other errors' fixes, with one exception: if one fix is completely contained inside another one, then we can apply the big one.

Event Timeline

angelgarcia updated this revision to Diff 36758.Oct 7 2015, 10:02 AM

angelgarcia retitled this revision from to Fix overlapping replacements in clang-tidy..

angelgarcia updated this object.

angelgarcia added reviewers: klimek, bkramer.

angelgarcia added subscribers: alexfh, cfe-commits.

klimek added inline comments.Oct 13 2015, 2:22 AM

clang-tidy/ClangTidyDiagnosticConsumer.cpp
417–418	These need to be documented.
420	I'd name this Queue instead (reading later I had no idea what this was).
432–435	Why are you calling this "Sit"?
474	Why do we need to sort?

Fix comments.

These need to be documented.

Done.

I'd name this Queue instead (reading later I had no idea what this was).

Done.

Why are you calling this "Sit"?

I didn't even know how to describe this variable without using examples,
and naming it is harder. More or less, it keeps track of the different
overlapping situations that have been spotted during the process, so "Sit"
stands for that. But I know it is a really poor name and I'd like to change
it, any ideas?

Why do we need to sort?

It's a remnant of a different approach that I tried before. Removed.

clang-tidy/ClangTidyDiagnosticConsumer.cpp
437–439	The description makes it unclear whether the indices stand for the set, or the dimensions stand for the sets (I can see it's the dimensions from where it is set, and that the indices are bools, but that's unclear without that / hard to grasp). Here's a slightly denormalized way that uses early-exit and might be easier to understand (more opinions would be nice, though). (Note that I probably messed up the FirstInsideSecond setting somewhere :)

    bool Overlaps = false;
    bool FirstInsideSecond = false;
    // In the loop:
    if (Count[0] != 0 && Count[1] != 0) {
      Overlaps = true;
    } else if (Overlaps && ((Count[0] != 0 && FirstInsideSecond) ||
                            (Count[1] != 0 && !FirstInsideSecond))) {
      return OK_Overlap;
    } else if (Count[0] != 0 || Count[1] != 0) {
      FirstInsideSecond = Count[1] != 0;
    }
    // After the loop:
    if (!Overlaps)
      return OK_Disjoint;
    return FirstInsideSecond ? OK_FirstInsideSecond : OK_SecondInsideFirst;

klimek added inline comments.Oct 14 2015, 3:24 AM

clang-tidy/ClangTidyDiagnosticConsumer.cpp
437–439	Ok, what I proposed doesn't work. I start to like your solution more. Perhaps we just need to document it better:
a) call it Coverage
b) introduce enum with Covered and Empty (or similar)
c) document:
   Coverage[Covered][Covered]: both set 1 and set 2 cover an area;
   Coverage[Covered][Empty]: set 1 covers an area set 2 doesn't cover;
   Coverage[Empty][Covered]: set 2 covers an area set 1 doesn't cover;

Add an enum and rename "Sit" to "Coverage" to improve readability.
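
For readers following the Coverage discussion, here is a minimal, self-contained sketch of the idea; it is not the code from this revision. It assumes each fix is a set of non-empty half-open intervals. The names `Interval`, `covers` and `classify` are hypothetical, while `Coverage`, `Covered`, `Empty` and the `OK_*` results follow the names used in the comments above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open interval [first, second) of file offsets (hypothetical type).
using Interval = std::pair<unsigned, unsigned>;

enum OverlapKind {
  OK_Disjoint,
  OK_FirstInsideSecond,
  OK_SecondInsideFirst,
  OK_Overlap
};

enum CoverageState { Empty = 0, Covered = 1 };

// Returns true if some interval of Set covers the whole segment [Lo, Hi).
static bool covers(const std::vector<Interval> &Set, unsigned Lo,
                   unsigned Hi) {
  for (const Interval &I : Set)
    if (I.first <= Lo && Hi <= I.second)
      return true;
  return false;
}

OverlapKind classify(const std::vector<Interval> &First,
                     const std::vector<Interval> &Second) {
  // Collect every endpoint. Between two consecutive endpoints the coverage
  // of both sets is constant, so one check per segment is enough.
  std::vector<unsigned> Points;
  auto AddPoints = [&Points](const std::vector<Interval> &Set) {
    for (const Interval &I : Set) {
      assert(I.first < I.second && "empty intervals are not handled here");
      Points.push_back(I.first);
      Points.push_back(I.second);
    }
  };
  AddPoints(First);
  AddPoints(Second);
  std::sort(Points.begin(), Points.end());
  Points.erase(std::unique(Points.begin(), Points.end()), Points.end());

  // Coverage[A][B] is true if some segment has state A in the first set and
  // state B in the second set.
  bool Coverage[2][2] = {{false, false}, {false, false}};
  for (std::size_t P = 0; P + 1 < Points.size(); ++P) {
    bool InFirst = covers(First, Points[P], Points[P + 1]);
    bool InSecond = covers(Second, Points[P], Points[P + 1]);
    Coverage[InFirst ? Covered : Empty][InSecond ? Covered : Empty] = true;
  }

  // If both sets never cover the same range, there is no overlap.
  if (!Coverage[Covered][Covered])
    return OK_Disjoint;
  // Each set covers something the other one does not: a real conflict.
  if (Coverage[Covered][Empty] && Coverage[Empty][Covered])
    return OK_Overlap;
  return Coverage[Empty][Covered] ? OK_FirstInsideSecond
                                  : OK_SecondInsideFirst;
}
```

With this sketch, `classify({{0, 10}}, {{2, 5}})` reports that the second set lies inside the first, while `classify({{0, 5}}, {{3, 8}})` reports a plain overlap. Two identical sets fall into the last branch and come out as `OK_SecondInsideFirst`, which is a property of this sketch rather than a statement about the patch.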

klimek added inline comments.Oct 14 2015, 4:36 AM

clang-tidy/ClangTidyDiagnosticConsumer.cpp
443–447	I'd just call out the 3 cases, like I suggested in my previous comment ("etc" sounds for me like there's more than 1 case left ;)
460	I'd say: // If both sets never cover the same range, there is no overlap.
471–474	I'd make that: return Coverage[Empty][Covered] ? OK_FirstInsideSecond : OK_SecondInsideFirst; (and adapt the comment)

Done.

klimek added inline comments.Oct 14 2015, 4:54 AM

clang-tidy/ClangTidyDiagnosticConsumer.cpp
494–495	I'm somewhat concerned about the quadratic runtime here, for cases where somebody messes up a {} pair in a header and we have 1000s of errors. Can we do a two step algorithm:
- go over all error-lists, compute bounding rectangles for each
- sort errors
- only do the squared algorithm for errors where the bounding rectangles overlap

Use bounding boxes to reduce complexity.
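
The bounding-box filter suggested above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this revision: each fix is modelled as a list of half-open intervals, and `Interval`, `Box` and `candidatePairs` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open interval [first, second) of file offsets (hypothetical type).
using Interval = std::pair<unsigned, unsigned>;

// Smallest range [Begin, End) that contains every interval of one fix.
struct Box {
  unsigned Begin;
  unsigned End;
  std::size_t ErrorId;
};

// Returns the pairs of fixes whose bounding boxes overlap. Only these pairs
// would need the expensive pairwise comparison.
std::vector<std::pair<std::size_t, std::size_t>>
candidatePairs(const std::vector<std::vector<Interval>> &Fixes) {
  std::vector<Box> Boxes;
  for (std::size_t I = 0; I < Fixes.size(); ++I) {
    if (Fixes[I].empty())
      continue;
    Box B{Fixes[I][0].first, Fixes[I][0].second, I};
    for (const Interval &R : Fixes[I]) {
      assert(R.first <= R.second && "malformed interval");
      B.Begin = std::min(B.Begin, R.first);
      B.End = std::max(B.End, R.second);
    }
    Boxes.push_back(B);
  }

  // Sort by start so that the inner loop can stop at the first box that
  // begins at or after the end of the current one.
  std::sort(Boxes.begin(), Boxes.end(),
            [](const Box &L, const Box &R) { return L.Begin < R.Begin; });

  std::vector<std::pair<std::size_t, std::size_t>> Pairs;
  for (std::size_t I = 0; I < Boxes.size(); ++I)
    for (std::size_t J = I + 1;
         J < Boxes.size() && Boxes[J].Begin < Boxes[I].End; ++J)
      Pairs.emplace_back(std::min(Boxes[I].ErrorId, Boxes[J].ErrorId),
                         std::max(Boxes[I].ErrorId, Boxes[J].ErrorId));
  return Pairs;
}
```

The worst case is still quadratic (for example, when one fix spans the whole file), but fixes that are far apart are never compared.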

LG in general now, looks to me like we have very few tests though.
My favorite strategy to make sure I have enough tests is to comment out code (or do mutations) as long as the tests still pass. Then write tests that fail with the mutation, then undo the mutation.

clang-tidy/ClangTidyDiagnosticConsumer.cpp
491–492	I think a comment at the top of this function outlining the full algorithm would be nice (without lots of details).
518–520	Document what goes into the pair - I assumed 'first, second' for a moment, and then was confused.

djasper added a subscriber: djasper.Oct 14 2015, 7:23 AM

djasper added inline comments.

clang-tidy/ClangTidyDiagnosticConsumer.cpp
494–495	From a brief look, it seems like it might be slightly easier to do what this is currently doing, but skip the bounding-box-calculation step. Simply do the same with each interval of each fix. We are just interested in efficiently calculating which fixes overlap at all. Now, I am not sure whether this is going to be more or less efficient:
- It might be hard / not worth it to calculate which fixes have already been compared, so we might compare the same two fixes repeatedly (once for each "touching point").
- We might not need to carefully compare fixes at all in spite of their bounding boxes overlapping, e.g. two fixes A and B might have 3 entirely disjoint intervals: A, B, A. Then, we don't need to do the sweep at all.

I've done a couple of runs for each version, and these are the results (I
have clang-tidy compiled with the option "RelWithDebInfo"):

$ time clang-tidy -checks=* test.cpp -- -std=c++11

Without looking for overlaps.

Suppressed 23463 warnings (23463 in non-user code).
Use -header-filter=.* to display errors from all non-system headers.

real 2m14.572s
user 2m13.136s
sys 0m0.483s

real 2m15.103s
user 2m13.361s
sys 0m0.687s

Bounding boxes

Suppressed 23463 warnings (23463 in non-user code).
Use -header-filter=.* to display errors from all non-system headers.

real 2m14.208s
user 2m13.051s
sys 0m0.643s

real 2m16.368s
user 2m14.286s
sys 0m0.986s

Quadratic

Suppressed 23463 warnings (23463 in non-user code).
Use -header-filter=.* to display errors from all non-system headers.

real 2m15.130s
user 2m13.627s
sys 0m0.499s

real 2m15.322s
user 2m13.660s
sys 0m0.683s

The time is about the same for all three versions. Note that the first
version doesn't do any sweep at all, and the last version invariably does
(23463 choose 2) = 275244453 sweeps. The amount of time required to do this
seems to be too low compared to the time that clang-tidy takes to output
these diagnostics.
Also, the fact that all three versions take about the same time is a bit
suspicious, but I double-checked that I was doing it right (I made a small
file which would cause overlapping and I checked that the message "note: this
fix will not be applied because it overlaps with another fix" was there
before each run).

I intended to implement Daniel's idea to check out which one was more
efficient, but with these results in sight I don't think it is worth it.

There might be an even easier algorithm:

What if we put all start and end points into a single sorted list. Identical start points are sorted by decreasing end points and identical end points are sorted by increasing start points. Now do a single sweep and simply count the total number of starts vs. the total number of ends, call this C (if you encounter a start point, ++C, if you encounter an end point, --C). If you encounter a start-point and C != 0, mark the interval and corresponding Fixit as inapplicable. Similarly if you encounter an end point and C != 1, mark the fixit as inapplicable. Basically, this does a single sweep over all fixits and marks the ones as bad that have either their start point or their endpoint within a different interval.

(Take this as an additional idea, if you have an existing algorithm that you like better, I am fine with keeping that. Just wanted to give my additional thoughts).
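
A minimal sketch of the single-sweep idea described above, under several assumptions that are not in the comment: every fix consists of exactly one non-empty half-open interval, an end point is processed before a start point at the same position, and identical end points are ordered by decreasing start point (which differs from the wording above). The names `Interval`, `Point` and `sweep` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open interval [first, second) of file offsets (hypothetical type).
using Interval = std::pair<unsigned, unsigned>;

// Returns, for each interval, whether its fix is still considered applicable.
std::vector<bool> sweep(const std::vector<Interval> &Intervals) {
  struct Point {
    unsigned Pos;
    bool IsStart;
    unsigned Other; // The opposite end point of the same interval.
    std::size_t Id;
  };

  std::vector<Point> Points;
  for (std::size_t I = 0; I < Intervals.size(); ++I) {
    assert(Intervals[I].first < Intervals[I].second && "empty interval");
    Points.push_back({Intervals[I].first, true, Intervals[I].second, I});
    Points.push_back({Intervals[I].second, false, Intervals[I].first, I});
  }

  std::sort(Points.begin(), Points.end(), [](const Point &L, const Point &R) {
    if (L.Pos != R.Pos)
      return L.Pos < R.Pos;
    if (L.IsStart != R.IsStart)
      return !L.IsStart; // Assumption: ends come before starts.
    if (L.Other != R.Other)
      return L.Other > R.Other; // Decreasing opposite end point.
    return L.Id < R.Id;
  });

  std::vector<bool> Applicable(Intervals.size(), true);
  int C = 0; // Number of currently open intervals.
  for (const Point &P : Points) {
    if (P.IsStart) {
      if (C != 0)
        Applicable[P.Id] = false; // Starts inside another interval.
      ++C;
    } else {
      if (C != 1)
        Applicable[P.Id] = false; // Ends while another interval is open.
      --C;
    }
  }
  assert(C == 0 && "every start needs a matching end");
  return Applicable;
}
```

In this sketch, an interval nested inside another one is marked inapplicable while the enclosing one survives, two partially overlapping intervals are both marked, and two identical intervals are both marked. Fixes with several intervals are deliberately left out.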

That works pretty well (well, identical end points have to be sorted
decreasingly, if I understand correctly). I see a couple of problems
though, like equal or consecutive intervals. But the algorithm is way more
efficient than what we currently have, so I will try to make it work.

I did several mutations and the only case where a test didn't break was when I removed the sort, but it turned out that we don't need it. I changed the tests to apply the checks in both orders to ensure that a test will break if the errors are not sorted.

I am still thinking about how to complete Daniel's idea.

I cannot find a way to make Daniel's idea work with equal intervals:

In this case, fix A can be applied because B is completely contained inside
it.
A: [a, b)[c, d)
B: [a, b)

This time, we should not apply either of them:
A: [a, b)
B: [a, b)

And here they both have to be discarded again, but for a different reason:
A: [a, b)[c, d)
B: [e, f)[a, b)

The problem is that we have three completely different situations that are
the same from the point of view of interval "[a, b)". The local situation
of individual intervals doesn't provide enough information.

I implemented this, with the following addition: if several errors share
the same interval, I can still apply the biggest one that was not discarded
during the sweep. This way, the first example would work and in the two
other examples it would just apply a random one, as you suggested. But I
found a different case that also fails:
A: [a, b)[c,d)
B: [e, f) such that e < a, b < f < c.

In this case it just discards A. This is not incorrect (as long as we don't
apply both, everything is OK), but it gets away from the idea of "if two
fixes overlap, discard both", which is what Manuel said when I started
this. I don't think that changing that idea is a problem, but right now the
behavior of the tool is so diffuse that maintaining and testing it would be
a bit painful. Also, if we want to allow applying one of the fixes when
they overlap, we have a different problem, it might be better just to start
from scratch and think about how to solve that problem.

New algorithm :)

clang-tidy/ClangTidyDiagnosticConsumer.cpp
488	Perhaps call this OpenIntervals, or if you like it short, just Open.
492–493	s/have/has/

This revision is now accepted and ready to land.Oct 16 2015, 4:24 AM

Remove unused include, fix typo and rename Count to OpenIntervals.

angelgarcia closed this revision.Oct 16 2015, 4:45 AM

Revision Contents

Path	Size
clang-tidy/ClangTidyDiagnosticConsumer.h	8 lines
clang-tidy/ClangTidyDiagnosticConsumer.cpp	168 lines
unittests/clang-tidy/OverlappingReplacementsTest.cpp	116 lines

Diff 37569

clang-tidy/ClangTidyDiagnosticConsumer.h

Show First 20 Lines • Show All 173 Lines • ▼ Show 20 Lines	public:

/// \brief Clears collected errors.		/// \brief Clears collected errors.
void clearErrors() { Errors.clear(); }		void clearErrors() { Errors.clear(); }

/// \brief Set the output struct for profile data.		/// \brief Set the output struct for profile data.
///		///
/// Setting a non-null pointer here will enable profile collection in		/// Setting a non-null pointer here will enable profile collection in
/// clang-tidy.		/// clang-tidy.
void setCheckProfileData(ProfileData* Profile);		void setCheckProfileData(ProfileData *Profile);
ProfileData* getCheckProfileData() const { return Profile; }		ProfileData *getCheckProfileData() const { return Profile; }

private:		private:
// Calls setDiagnosticsEngine() and storeError().		// Calls setDiagnosticsEngine() and storeError().
friend class ClangTidyDiagnosticConsumer;		friend class ClangTidyDiagnosticConsumer;

/// \brief Sets the \c DiagnosticsEngine so that Diagnostics can be generated		/// \brief Sets the \c DiagnosticsEngine so that Diagnostics can be generated
/// correctly.		/// correctly.
void setDiagnosticsEngine(DiagnosticsEngine *Engine);		void setDiagnosticsEngine(DiagnosticsEngine *Engine);
Show All 34 Lines	void HandleDiagnostic(DiagnosticsEngine::Level DiagLevel,
const Diagnostic &Info) override;		const Diagnostic &Info) override;

/// \brief Flushes the internal diagnostics buffer to the ClangTidyContext.		/// \brief Flushes the internal diagnostics buffer to the ClangTidyContext.
void finish() override;		void finish() override;

private:		private:
void finalizeLastError();		void finalizeLastError();

		void removeIncompatibleErrors(SmallVectorImpl<ClangTidyError> &Errors) const;

/// \brief Returns the \c HeaderFilter constructed for the options set in the		/// \brief Returns the \c HeaderFilter constructed for the options set in the
/// context.		/// context.
llvm::Regex* getHeaderFilter();		llvm::Regex *getHeaderFilter();

/// \brief Updates \c LastErrorRelatesToUserCode and LastErrorPassesLineFilter		/// \brief Updates \c LastErrorRelatesToUserCode and LastErrorPassesLineFilter
/// according to the diagnostic \p Location.		/// according to the diagnostic \p Location.
void checkFilters(SourceLocation Location);		void checkFilters(SourceLocation Location);
bool passesLineFilter(StringRef FileName, unsigned LineNumber) const;		bool passesLineFilter(StringRef FileName, unsigned LineNumber) const;

ClangTidyContext &Context;		ClangTidyContext &Context;
std::unique_ptr<DiagnosticsEngine> Diags;		std::unique_ptr<DiagnosticsEngine> Diags;
Show All 10 Lines

clang-tidy/ClangTidyDiagnosticConsumer.cpp

Show All 16 Lines
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "ClangTidyDiagnosticConsumer.h"		#include "ClangTidyDiagnosticConsumer.h"
#include "ClangTidyOptions.h"		#include "ClangTidyOptions.h"
#include "clang/AST/ASTDiagnostic.h"		#include "clang/AST/ASTDiagnostic.h"
#include "clang/Basic/DiagnosticOptions.h"		#include "clang/Basic/DiagnosticOptions.h"
#include "clang/Frontend/DiagnosticRenderer.h"		#include "clang/Frontend/DiagnosticRenderer.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
#include <set>
#include <tuple>		#include <tuple>
		#include <vector>
using namespace clang;		using namespace clang;
using namespace tidy;		using namespace tidy;

namespace {		namespace {
class ClangTidyDiagnosticRenderer : public DiagnosticRenderer {		class ClangTidyDiagnosticRenderer : public DiagnosticRenderer {
public:		public:
ClangTidyDiagnosticRenderer(const LangOptions &LangOpts,		ClangTidyDiagnosticRenderer(const LangOptions &LangOpts,
DiagnosticOptions *DiagOpts,		DiagnosticOptions *DiagOpts,
▲ Show 20 Lines • Show All 106 Lines • ▼ Show 20 Lines	else if (MetaChars.find(C) != StringRef::npos)
RegexText.push_back('\\');		RegexText.push_back('\\');
RegexText.push_back(C);		RegexText.push_back(C);
}		}
RegexText.push_back('$');		RegexText.push_back('$');
return llvm::Regex(RegexText);		return llvm::Regex(RegexText);
}		}

GlobList::GlobList(StringRef Globs)		GlobList::GlobList(StringRef Globs)
: Positive(!ConsumeNegativeIndicator(Globs)),		: Positive(!ConsumeNegativeIndicator(Globs)), Regex(ConsumeGlob(Globs)),
Regex(ConsumeGlob(Globs)),
NextGlob(Globs.empty() ? nullptr : new GlobList(Globs)) {}		NextGlob(Globs.empty() ? nullptr : new GlobList(Globs)) {}

bool GlobList::contains(StringRef S, bool Contains) {		bool GlobList::contains(StringRef S, bool Contains) {
if (Regex.match(S))		if (Regex.match(S))
Contains = Positive;		Contains = Positive;

if (NextGlob)		if (NextGlob)
Contains = NextGlob->contains(S, Contains);		Contains = NextGlob->contains(S, Contains);
▲ Show 20 Lines • Show All 58 Lines • ▼ Show 20 Lines
const ClangTidyGlobalOptions &ClangTidyContext::getGlobalOptions() const {		const ClangTidyGlobalOptions &ClangTidyContext::getGlobalOptions() const {
return OptionsProvider->getGlobalOptions();		return OptionsProvider->getGlobalOptions();
}		}

const ClangTidyOptions &ClangTidyContext::getOptions() const {		const ClangTidyOptions &ClangTidyContext::getOptions() const {
return CurrentOptions;		return CurrentOptions;
}		}

void ClangTidyContext::setCheckProfileData(ProfileData *P) {		void ClangTidyContext::setCheckProfileData(ProfileData *P) { Profile = P; }
Profile = P;
}

GlobList &ClangTidyContext::getChecksFilter() {		GlobList &ClangTidyContext::getChecksFilter() {
assert(CheckFilter != nullptr);		assert(CheckFilter != nullptr);
return *CheckFilter;		return *CheckFilter;
}		}

/// \brief Store a \c ClangTidyError.		/// \brief Store a \c ClangTidyError.
void ClangTidyContext::storeError(const ClangTidyError &Error) {		void ClangTidyContext::storeError(const ClangTidyError &Error) {
▲ Show 20 Lines • Show All 55 Lines • ▼ Show 20 Lines	if (DiagLevel == DiagnosticsEngine::Note) {
std::string CheckName = !WarningOption.empty()		std::string CheckName = !WarningOption.empty()
? ("clang-diagnostic-" + WarningOption).str()		? ("clang-diagnostic-" + WarningOption).str()
: Context.getCheckName(Info.getID()).str();		: Context.getCheckName(Info.getID()).str();

if (CheckName.empty()) {		if (CheckName.empty()) {
// This is a compiler diagnostic without a warning option. Assign check		// This is a compiler diagnostic without a warning option. Assign check
// name based on its level.		// name based on its level.
switch (DiagLevel) {		switch (DiagLevel) {
case DiagnosticsEngine::Error:		case DiagnosticsEngine::Error:
case DiagnosticsEngine::Fatal:		case DiagnosticsEngine::Fatal:
CheckName = "clang-diagnostic-error";		CheckName = "clang-diagnostic-error";
break;		break;
case DiagnosticsEngine::Warning:		case DiagnosticsEngine::Warning:
CheckName = "clang-diagnostic-warning";		CheckName = "clang-diagnostic-warning";
break;		break;
default:		default:
CheckName = "clang-diagnostic-unknown";		CheckName = "clang-diagnostic-unknown";
break;		break;
}		}
}		}

ClangTidyError::Level Level = ClangTidyError::Warning;		ClangTidyError::Level Level = ClangTidyError::Warning;
if (DiagLevel == DiagnosticsEngine::Error ||		if (DiagLevel == DiagnosticsEngine::Error ||
DiagLevel == DiagnosticsEngine::Fatal) {		DiagLevel == DiagnosticsEngine::Fatal) {
// Force reporting of Clang errors regardless of filters and non-user		// Force reporting of Clang errors regardless of filters and non-user
// code.		// code.
Show All 18 Lines	void ClangTidyDiagnosticConsumer::HandleDiagnostic(

checkFilters(Info.getLocation());		checkFilters(Info.getLocation());
}		}

bool ClangTidyDiagnosticConsumer::passesLineFilter(StringRef FileName,		bool ClangTidyDiagnosticConsumer::passesLineFilter(StringRef FileName,
unsigned LineNumber) const {		unsigned LineNumber) const {
if (Context.getGlobalOptions().LineFilter.empty())		if (Context.getGlobalOptions().LineFilter.empty())
return true;		return true;
for (const FileFilter& Filter : Context.getGlobalOptions().LineFilter) {		for (const FileFilter &Filter : Context.getGlobalOptions().LineFilter) {
if (FileName.endswith(Filter.Name)) {		if (FileName.endswith(Filter.Name)) {
if (Filter.LineRanges.empty())		if (Filter.LineRanges.empty())
return true;		return true;
for (const FileFilter::LineRange &Range : Filter.LineRanges) {		for (const FileFilter::LineRange &Range : Filter.LineRanges) {
if (Range.first <= LineNumber && LineNumber <= Range.second)		if (Range.first <= LineNumber && LineNumber <= Range.second)
return true;		return true;
}		}
return false;		return false;
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines

llvm::Regex *ClangTidyDiagnosticConsumer::getHeaderFilter() {		llvm::Regex *ClangTidyDiagnosticConsumer::getHeaderFilter() {
if (!HeaderFilter)		if (!HeaderFilter)
HeaderFilter.reset(		HeaderFilter.reset(
new llvm::Regex(*Context.getOptions().HeaderFilterRegex));		new llvm::Regex(*Context.getOptions().HeaderFilterRegex));
return HeaderFilter.get();		return HeaderFilter.get();
}		}

		void ClangTidyDiagnosticConsumer::removeIncompatibleErrors(
		SmallVectorImpl<ClangTidyError> &Errors) const {
		// Each error is modelled as the set of intervals in which it applies
		// replacements. To detect overlapping replacements, we use a sweep line
		// algorithm over these sets of intervals.
		// An event here consists of the opening or closing of an interval. During the
		// proccess, we maintain a counter with the amount of open intervals. If we
		// find an endpoint of an interval and this counter is different from 0, it
		// means that this interval overlaps with another one, so we set it as
		// inapplicable.
		struct Event {
		// An event can be either the begin or the end of an interval.
		enum EventType {
		ET_Begin = 1,
		ET_End = -1,
		};

		Event(unsigned Begin, unsigned End, EventType Type, unsigned ErrorId,
		unsigned ErrorSize)
		: Type(Type), ErrorId(ErrorId) {
		// The events are going to be sorted by their position. In case of draw:
		//
		// * If an interval ends at the same position at which other interval
		// begins, this is not an overlapping, so we want to remove the ending
		// interval before adding the starting one: end events have higher
		// priority than begin events.
		//
		// * If we have several begin points at the same position, we will mark as
		// inapplicable the ones that we proccess later, so the first one has to
		// be the one with the latest end point, because this one will contain
		// all the other intervals. For the same reason, if we have several end
		// points in the same position, the last one has to be the one with the
		// earliest begin point. In both cases, we sort non-increasingly by the
		// position of the complementary.
		//
		// * In case of two equal intervals, the one whose error is bigger can
		// potentially contain the other one, so we want to proccess its begin
		// points before and its end points later.
		//
		// * Finally, if we have two equal intervals whose errors have the same
		// size, none of them will be strictly contained inside the other.
		// Sorting by ErrorId will guarantee that the begin point of the first
		// one will be proccessed before, disallowing the second one, and the
		// end point of the first one will also be proccessed before,
		// disallowing the first one.
		if (Type == ET_Begin)
		Priority = std::make_tuple(Begin, Type, -End, -ErrorSize, ErrorId);
		else
		Priority = std::make_tuple(End, Type, -Begin, ErrorSize, ErrorId);
		}

		bool operator<(const Event &Other) const {
		return Priority < Other.Priority;
		}

		// Determines if this event is the begin or the end of an interval.
		EventType Type;
		// The index of the error to which the interval that generated this event
		// belongs.
		unsigned ErrorId;
		// The events will be sorted based on this field.
		std::tuple<unsigned, EventType, int, int, unsigned> Priority;
		};

		// Compute error sizes.
		std::vector<int> Sizes;
		for (const auto &Error : Errors) {
		int Size = 0;
		for (const auto &Replace : Error.Fix)
		Size += Replace.getLength();
		Sizes.push_back(Size);
		}

		// Build events from error intervals.
		std::vector<Event> Events;
		for (unsigned I = 0; I < Errors.size(); ++I) {
		for (const auto &Replace : Errors[I].Fix) {
		unsigned Begin = Replace.getOffset();
		unsigned End = Begin + Replace.getLength();
		// FIXME: Handle empty intervals, such as those from insertions.
		if (Begin == End)
		continue;
		Events.push_back(Event(Begin, End, Event::ET_Begin, I, Sizes[I]));
		Events.push_back(Event(Begin, End, Event::ET_End, I, Sizes[I]));
		}
		}
		std::sort(Events.begin(), Events.end());

		// Sweep.
		std::vector<bool> Apply(Errors.size(), true);
		int OpenIntervals = 0;
		for (const auto &Event : Events) {
		if (Event.Type == Event::ET_End)
		--OpenIntervals;
		// This has to be checked after removing the interval from the count if it
		// is an end event, or before adding it if it is a begin event.
		if (OpenIntervals != 0)
		Apply[Event.ErrorId] = false;
		if (Event.Type == Event::ET_Begin)
		++OpenIntervals;
		}
		assert(OpenIntervals == 0 && "Amount of begin/end points doesn't match");

		for (unsigned I = 0; I < Errors.size(); ++I) {
		if (!Apply[I]) {
		Errors[I].Fix.clear();
		Errors[I].Notes.push_back(
		ClangTidyMessage("this fix will not be applied because"
		" it overlaps with another fix"));
		}
		}
		}

namespace {		namespace {
struct LessClangTidyError {		struct LessClangTidyError {
bool operator()(const ClangTidyError *LHS, const ClangTidyError *RHS) const {		bool operator()(const ClangTidyError &LHS, const ClangTidyError &RHS) const {
const ClangTidyMessage &M1 = LHS->Message;		const ClangTidyMessage &M1 = LHS.Message;
const ClangTidyMessage &M2 = RHS->Message;		const ClangTidyMessage &M2 = RHS.Message;

return std::tie(M1.FilePath, M1.FileOffset, M1.Message) <		return std::tie(M1.FilePath, M1.FileOffset, M1.Message) <
std::tie(M2.FilePath, M2.FileOffset, M2.Message);		std::tie(M2.FilePath, M2.FileOffset, M2.Message);
}		}
};		};
		struct EqualClangTidyError {
		static LessClangTidyError Less;
		bool operator()(const ClangTidyError &LHS, const ClangTidyError &RHS) const {
		return !Less(LHS, RHS) && !Less(RHS, LHS);
		}
		};
} // end anonymous namespace		} // end anonymous namespace

// Flushes the internal diagnostics buffer to the ClangTidyContext.		// Flushes the internal diagnostics buffer to the ClangTidyContext.
void ClangTidyDiagnosticConsumer::finish() {		void ClangTidyDiagnosticConsumer::finish() {
finalizeLastError();		finalizeLastError();
std::set<const ClangTidyError*, LessClangTidyError> UniqueErrors;
for (const ClangTidyError &Error : Errors)
UniqueErrors.insert(&Error);

for (const ClangTidyError *Error : UniqueErrors)		std::sort(Errors.begin(), Errors.end(), LessClangTidyError());
Context.storeError(*Error);		Errors.erase(std::unique(Errors.begin(), Errors.end(), EqualClangTidyError()),
		Errors.end());
		removeIncompatibleErrors(Errors);

		for (const ClangTidyError &Error : Errors)
		Context.storeError(Error);
Errors.clear();		Errors.clear();
}		}

unittests/clang-tidy/OverlappingReplacementsTest.cpp

Show First 20 Lines • Show All 71 Lines • ▼ Show 20 Lines	void registerMatchers(ast_matchers::MatchFinder *Finder) final {
using namespace ast_matchers;		using namespace ast_matchers;
Finder->addMatcher(varDecl(matchesName(NamePattern)).bind(BoundDecl), this);		Finder->addMatcher(varDecl(matchesName(NamePattern)).bind(BoundDecl), this);
}		}

void check(const ast_matchers::MatchFinder::MatchResult &Result) final {		void check(const ast_matchers::MatchFinder::MatchResult &Result) final {
auto *VD = Result.Nodes.getNodeAs<VarDecl>(BoundDecl);		auto *VD = Result.Nodes.getNodeAs<VarDecl>(BoundDecl);
std::string NewName = newName(VD->getName());		std::string NewName = newName(VD->getName());

auto Diag = diag(VD->getLocation(), "refactor")		auto Diag = diag(VD->getLocation(), "refactor %0 into %1")
		<< VD->getName() << NewName
<< FixItHint::CreateReplacement(		<< FixItHint::CreateReplacement(
CharSourceRange::getTokenRange(VD->getLocation(),		CharSourceRange::getTokenRange(VD->getLocation(),
VD->getLocation()),		VD->getLocation()),
NewName);		NewName);

class UsageVisitor : public RecursiveASTVisitor<UsageVisitor> {		class UsageVisitor : public RecursiveASTVisitor<UsageVisitor> {
public:		public:
UsageVisitor(const ValueDecl *VD, StringRef NewName,		UsageVisitor(const ValueDecl *VD, StringRef NewName,
DiagnosticBuilder &Diag)		DiagnosticBuilder &Diag)
: VD(VD), NewName(NewName), Diag(Diag) {}		: VD(VD), NewName(NewName), Diag(Diag) {}
bool VisitDeclRefExpr(DeclRefExpr *E) {		bool VisitDeclRefExpr(DeclRefExpr *E) {
if (const ValueDecl *D = E->getDecl()) {		if (const ValueDecl *D = E->getDecl()) {
[...]
TEST(OverlappingReplacementsTest, ReplacementInsideOtherReplacement) {
} else if (int a = 0) {		} else if (int a = 0) {
char potato = 0;		char potato = 0;
if (potato) potato;		if (potato) potato;
}		}
})";		})";

// Apply the UseCharCheck together with the IfFalseCheck.		// Apply the UseCharCheck together with the IfFalseCheck.
//		//
// The 'If' fix is bigger, so that is the one that has to be applied.		// The 'If' fix contains the other, so that is the one that has to be applied.
// } else if (int a = 0) {		// } else if (int a = 0) {
// ^^^ -> char		// ^^^ -> char
// ~~~~~~~~~ -> false		// ~~~~~~~~~ -> false
const char CharIfFix[] =		const char CharIfFix[] =
R"(void f() {		R"(void f() {
if (false) {		if (false) {
} else if (false) {		} else if (false) {
char potato = 0;		char potato = 0;
if (false) potato;		if (false) potato;
}		}
})";		})";
Res = runCheckOnCode<UseCharCheck, IfFalseCheck>(Code);		Res = runCheckOnCode<UseCharCheck, IfFalseCheck>(Code);
// FIXME: EXPECT_EQ(CharIfFix, Res);		EXPECT_EQ(CharIfFix, Res);
		Res = runCheckOnCode<IfFalseCheck, UseCharCheck>(Code);
		EXPECT_EQ(CharIfFix, Res);

// Apply the IfFalseCheck with the StartsWithPotaCheck.		// Apply the IfFalseCheck with the StartsWithPotaCheck.
//		//
// The 'If' replacement is bigger here.		// The 'If' replacement is bigger here.
// if (char potato = 0) {		// if (char potato = 0) {
// ^^^^^^ -> tomato		// ^^^^^^ -> tomato
// ~~~~~~~~~~~~~~~ -> false		// ~~~~~~~~~~~~~~~ -> false
//		//
// But the refactoring is bigger here:		// But the refactoring is the one that contains the other here:
// char potato = 0;		// char potato = 0;
// ^^^^^^ -> tomato		// ^^^^^^ -> tomato
// if (potato) potato;		// if (potato) potato;
// ^^^^^^ ^^^^^^ -> tomato, tomato		// ^^^^^^ ^^^^^^ -> tomato, tomato
// ~~~~~~ -> false		// ~~~~~~ -> false
const char IfStartsFix[] =		const char IfStartsFix[] =
R"(void f() {		R"(void f() {
if (false) {		if (false) {
} else if (false) {		} else if (false) {
char tomato = 0;		char tomato = 0;
if (tomato) tomato;		if (tomato) tomato;
}		}
})";		})";
Res = runCheckOnCode<IfFalseCheck, StartsWithPotaCheck>(Code);		Res = runCheckOnCode<IfFalseCheck, StartsWithPotaCheck>(Code);
// FIXME: EXPECT_EQ(IfStartsFix, Res);		EXPECT_EQ(IfStartsFix, Res);
		Res = runCheckOnCode<StartsWithPotaCheck, IfFalseCheck>(Code);
// Silence warnings.		EXPECT_EQ(IfStartsFix, Res);
(void)CharIfFix;
(void)IfStartsFix;
}		}

TEST(OverlappingReplacementsTest, ApplyFullErrorOrNothingWhenOverlapping) {		TEST(OverlappingReplacements, TwoReplacementsInsideOne) {
std::string Res;		std::string Res;
const char Code[] =		const char Code[] =
R"(void f() {		R"(void f() {
int potato = 0;		if (int potato = 0) {
potato += potato * potato;		int a = 0;
if (char this_name_make_this_if_really_long = potato) potato;		}
})";		})";

// StartsWithPotaCheck will try to refactor 'potato' into 'tomato',		// The two smallest replacements should not be applied.
// and EndsWithTatoCheck will try to use 'pomelo'. We have to apply		// if (int potato = 0) {
// either all conversions from one check, or all from the other.		// ^^^^^^ -> tomato
const char StartsFix[] =		// *** -> char
		// ~~~~~~~~~~~~~~ -> false
		// But other errors from the same checks should not be affected.
		// int a = 0;
		// *** -> char
		const char Fix[] =
R"(void f() {		R"(void f() {
int tomato = 0;		if (false) {
tomato += tomato * tomato;		char a = 0;
if (char this_name_make_this_if_really_long = tomato) tomato;		}
})";		})";
const char EndsFix[] =		Res = runCheckOnCode<UseCharCheck, IfFalseCheck, StartsWithPotaCheck>(Code);
		EXPECT_EQ(Fix, Res);
		Res = runCheckOnCode<StartsWithPotaCheck, IfFalseCheck, UseCharCheck>(Code);
		EXPECT_EQ(Fix, Res);
		}

		TEST(OverlappingReplacementsTest,
		ApplyAtMostOneOfTheChangesWhenPartialOverlapping) {
		std::string Res;
		const char Code[] =
R"(void f() {		R"(void f() {
int pomelo = 0;		if (int potato = 0) {
pomelo += pomelo * pomelo;		int a = potato;
if (char this_name_make_this_if_really_long = pomelo) pomelo;		}
})";		})";
// In case of overlapping, we will prioritize the biggest fix. However, these
// two fixes have the same size and position, so we don't know yet which one
// will have preference.
Res = runCheckOnCode<StartsWithPotaCheck, EndsWithTatoCheck>(Code);
// FIXME: EXPECT_TRUE(Res == StartsFix \|\| Res == EndsFix);

// StartsWithPotaCheck will try to refactor 'potato' into 'tomato', but		// These two replacements overlap, but none of them is completely contained
// replacing the 'if' condition is a bigger change than all the refactoring		// inside the other.
// changes together (48 vs 36), so this is the one that is going to be		// if (int potato = 0) {
// applied.		// ^^^^^^ -> tomato
		// ~~~~~~~~~~~~~~ -> false
		// int a = potato;
		// ^^^^^^ -> tomato
		//
		// The 'StartsWithPotaCheck' fix has endpoints inside the 'IfFalseCheck' fix,
		// so it is going to be set as inapplicable. The 'if' fix will be applied.
const char IfFix[] =		const char IfFix[] =
R"(void f() {		R"(void f() {
		if (false) {
		int a = potato;
		}
		})";
		Res = runCheckOnCode<IfFalseCheck, StartsWithPotaCheck>(Code);
		EXPECT_EQ(IfFix, Res);
		}

		TEST(OverlappingReplacementsTest, TwoErrorsHavePerfectOverlapping) {
		std::string Res;
		const char Code[] =
		R"(void f() {
int potato = 0;		int potato = 0;
potato += potato * potato;		potato += potato * potato;
if (true) potato;		if (char a = potato) potato;
})";		})";
Res = runCheckOnCode<StartsWithPotaCheck, IfFalseCheck>(Code);
// FIXME: EXPECT_EQ(IfFix, Res);

// Silence warnings.		// StartsWithPotaCheck will try to refactor 'potato' into 'tomato', and
(void)StartsFix;		// EndsWithTatoCheck will try to use 'pomelo'. Both fixes have the same set of
(void)EndsFix;		// ranges. This is a corner case of one error completely containing another:
(void)IfFix;		// the other completely contains the first one as well. Both errors are
		// discarded.

		Res = runCheckOnCode<StartsWithPotaCheck, EndsWithTatoCheck>(Code);
		EXPECT_EQ(Code, Res);
}		}

} // namespace test		} // namespace test
} // namespace tidy		} // namespace tidy
} // namespace clang		} // namespace clang
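
The test comments above describe the expected outcomes: a fix that is completely contained in another is dropped in favour of the containing one, a fix with endpoints inside another fix is marked inapplicable, and two fixes with identical ranges are both discarded. One way to get those outcomes is an endpoint sweep, sketched below. This is not the code under review. `Range`, `Fix` and `applicableFixes` are hypothetical names, offsets are assumed to refer to a single file, and ranges are assumed to be half-open and non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical model: a half-open character range [Begin, End).
struct Range {
  unsigned Begin;
  unsigned End;
};

// All replacement ranges that belong to one error's fix.
using Fix = std::vector<Range>;

// Returns, for each fix, whether it may be applied. A fix is rejected as
// soon as one of its endpoints falls while another range is still open.
std::vector<bool> applicableFixes(const std::vector<Fix> &Fixes) {
  // (position, type, tie-break 1, tie-break 2, fix index).
  // type: 0 = end, 1 = begin, so ends sort before begins at equal positions
  // and merely adjacent ranges never count as overlapping.
  using Event = std::tuple<long long, int, long long, long long, std::size_t>;
  std::vector<Event> Events;
  for (std::size_t Id = 0; Id < Fixes.size(); ++Id) {
    long long Size = 0; // total length of the fix, used to break ties
    for (const Range &R : Fixes[Id]) {
      assert(R.Begin < R.End && "ranges are assumed to be non-empty");
      Size += R.End - R.Begin;
    }
    for (const Range &R : Fixes[Id]) {
      // At equal positions, longer ranges and bigger fixes open first...
      Events.emplace_back(R.Begin, 1, -static_cast<long long>(R.End), -Size, Id);
      // ...and close last, so the containing fix sees no open range.
      Events.emplace_back(R.End, 0, -static_cast<long long>(R.Begin), Size, Id);
    }
  }
  std::sort(Events.begin(), Events.end());

  std::vector<bool> Apply(Fixes.size(), true);
  int Open = 0; // number of ranges currently open
  for (const Event &E : Events) {
    const bool IsBegin = std::get<1>(E) == 1;
    if (!IsBegin)
      --Open; // do not count the range that is closing
    if (Open != 0)
      Apply[std::get<4>(E)] = false;
    if (IsBegin)
      ++Open; // count the new range only after the check
  }
  return Apply;
}
```

With identical fixes the tie falls through to the fix index: the second fix opens while the first is open, and the first closes while the second is open, so both are rejected.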

Revision Contents

Diff 37569

clang-tidy/ClangTidyDiagnosticConsumer.h

clang-tidy/ClangTidyDiagnosticConsumer.cpp

unittests/clang-tidy/OverlappingReplacementsTest.cpp
