Details

Reviewers

klimek
djasper

Commits

rGce5e4bc7ac87: Make clang-format cleaner remove redundant commas in list and redundant colon…
rC269888: Make clang-format cleaner remove redundant commas in list and redundant colon…
rL269888: Make clang-format cleaner remove redundant commas in list and redundant colon…

Summary

Make the clang-format cleaner remove redundant commas and colons in constructor initializer lists.

Event Timeline

ioeric updated this revision to Diff 55810. · May 2 2016, 6:47 AM

ioeric retitled this revision to "Make clang-format cleaner remove redundant commas/colons in constructor initializer list."

ioeric updated this object.

ioeric added reviewers: djasper, klimek.

ioeric added a subscriber: cfe-commits.

Herald added a subscriber: klimek. · May 2 2016, 6:47 AM

djasper added inline comments. · May 8 2016, 12:12 PM

lib/Format/Format.cpp
1821	Why are we restricting this to constructor initializers? I think we should be more generic from the start and clean up other kinds of lists as well. Also, an idea: could we make this very generic and implement a function that scans a line for a specific sequence of tokens? For example, I would assume that the current `checkConstructorInitList()` could then be written as: `cleanupLeft(tok::comma, tok::comma); cleanupLeft(tok::comma, tok::l_brace); cleanupRight(tok::colon, tok::comma); cleanupLeft(tok::colon, tok::l_brace);` with cleanupLeft/Right meaning: find this sequence of tokens (ignoring comments) and then clean up the left or the right side. Not sure about the exact names of the functions etc. What do you think?

ioeric added inline comments. · May 9 2016, 3:20 AM

lib/Format/Format.cpp
1821	I think having a generic `cleanupLeft(tok::comma, tok::comma)` is a great idea; however, the other three functions seem a little too generic since they'd probably only be used in constructor initializers? E.g. an expression like `condition ? something : { list }` would be a false positive for `cleanupLeft(tok::colon, tok::l_brace)`, and `[{...}, {...}]` might be a false positive for `cleanupLeft(tok::comma, tok::l_brace)`. Also, it seems to me that `cleanupRight(tok::colon, tok::comma)` would only happen in constructor initializers? I think a mixed solution might work as well. For example, we can run `cleanupLeft(tok::comma, tok::comma)` across all tokens in the code, and then for constructor initializers specifically, we handle redundant colon and trailing comma. What do you think? As for comments, we can probably handle them after all cleanup is done.

Extended redundant comma cleanup to general lists, and changed the way the constructor initializer list is handled. Removed comment cleanup; left it for a future patch.

ioeric retitled this revision from "Make clang-format cleaner remove redundant commas/colons in constructor initializer list." to "Make clang-format cleaner remove redundant commas in list and redundant colon in constructor initializer." · May 10 2016, 5:45 AM

PING

I experimented a bit. What do you think of this?

lib/Format/Format.cpp

1822

You could turn this into:

for (auto &Line : AnnotatedLines) {
  if (Line->Affected) {
    cleanupRight(Line->First, tok::comma, tok::comma);
    cleanupRight(Line->First, TT_CtorInitializerColon, tok::comma);
    cleanupLeft(Line->First, tok::comma, tok::l_brace);
    cleanupLeft(Line->First, TT_CtorInitializerColon, tok::l_brace);
  }
}

1912

And all of this into:

// Checks pairs {start, start->next},..., {end->previous, end} and deletes one
// of the tokens in the pair if the left token has \p LK token kind and the
// right token has \p RK token kind. If \p DeleteLeft is true, the left token
// is deleted on match; otherwise, the right token is deleted.
template <typename LeftKind, typename RightKind>
void cleanupPair(FormatToken *Start, LeftKind LK, RightKind RK,
                 bool DeleteLeft) {
  auto NextNotDeleted = [this](const FormatToken &Tok) -> FormatToken * {
    for (auto *Res = Tok.Next; Res; Res = Res->Next)
      if (!Res->is(tok::comment) &&
          DeletedTokens.find(Res) == DeletedTokens.end())
        return Res;
    return nullptr;
  };
  for (auto *Left = Start; Left;) {
    auto *Right = NextNotDeleted(*Left);
    if (!Right)
      break;
    if (Left->is(LK) && Right->is(RK)) {
      deleteToken(DeleteLeft ? Left : Right);
      // If the right token is deleted, we should keep the left token
      // unchanged and pair it with the new right token.
      if (!DeleteLeft)
        continue;
    }
    Left = Right;
  }
}

template <typename LeftKind, typename RightKind>
void cleanupLeft(FormatToken *Start, LeftKind LK, RightKind RK) {
  cleanupPair(Start, LK, RK, /*DeleteLeft=*/true);
}

template <typename LeftKind, typename RightKind>
void cleanupRight(FormatToken *Start, LeftKind LK, RightKind RK) {
  cleanupPair(Start, LK, RK, /*DeleteLeft=*/false);
}
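
To see the pair-cleanup logic in isolation, here is a minimal standalone sketch. It is not the clang-format code: `Token`, `makeTokens`, `nextLive`, `render` and `cleanupLine` are hypothetical stand-ins, token kinds are modelled as plain strings, and a comment is recognised only by a leading `/*`. The traversal is meant to mirror `cleanupPair` above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for FormatToken: text plus two flags.
struct Token {
  std::string Text;
  bool IsComment;
  bool Deleted;
};

// Builds tokens from raw texts; anything starting with "/*" is a comment.
std::vector<Token> makeTokens(const std::vector<std::string> &Texts) {
  std::vector<Token> Tokens;
  for (const std::string &T : Texts) {
    Token Tok;
    Tok.Text = T;
    Tok.IsComment = T.rfind("/*", 0) == 0;
    Tok.Deleted = false;
    Tokens.push_back(Tok);
  }
  return Tokens;
}

// Index of the next token after I that is neither a comment nor deleted,
// or Tokens.size() if there is none.
std::size_t nextLive(const std::vector<Token> &Tokens, std::size_t I) {
  for (std::size_t J = I + 1; J < Tokens.size(); ++J)
    if (!Tokens[J].IsComment && !Tokens[J].Deleted)
      return J;
  return Tokens.size();
}

// Walks adjacent live pairs; on an (LK, RK) match, marks the left or the
// right token of the pair as deleted.
void cleanupPair(std::vector<Token> &Tokens, const std::string &LK,
                 const std::string &RK, bool DeleteLeft) {
  assert(!LK.empty() && !RK.empty());
  for (std::size_t Left = 0; Left < Tokens.size();) {
    std::size_t Right = nextLive(Tokens, Left);
    if (Right == Tokens.size())
      break;
    if (Tokens[Left].Text == LK && Tokens[Right].Text == RK) {
      Tokens[DeleteLeft ? Left : Right].Deleted = true;
      // After deleting the right token, keep Left and pair it with the
      // next live token.
      if (!DeleteLeft)
        continue;
    }
    Left = Right;
  }
}

// Joins the surviving tokens (comments included) with single spaces.
std::string render(const std::vector<Token> &Tokens) {
  std::string Out;
  for (const Token &T : Tokens) {
    if (T.Deleted)
      continue;
    if (!Out.empty())
      Out += ' ';
    Out += T.Text;
  }
  return Out;
}

// Runs the four passes proposed for line 1822, with string kinds.
std::string cleanupLine(const std::vector<std::string> &Texts) {
  std::vector<Token> Tokens = makeTokens(Texts);
  cleanupPair(Tokens, ",", ",", /*DeleteLeft=*/false);
  cleanupPair(Tokens, ":", ",", /*DeleteLeft=*/false);
  cleanupPair(Tokens, ",", "{", /*DeleteLeft=*/true);
  cleanupPair(Tokens, ":", "{", /*DeleteLeft=*/true);
  return render(Tokens);
}
```

Under these assumptions, `cleanupLine({"A", "(", ")", ":", ",", "{", "}"})` returns `A ( ) { }`: the second pass removes the comma after the colon, and the last pass removes the colon that is then directly followed by `{`. Comments are skipped when forming pairs but are never deleted in this sketch.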

ioeric added inline comments. · May 13 2016, 1:44 AM

lib/Format/Format.cpp
1822	Wouldn't `cleanupLeft(Line->First, tok::comma, tok::l_brace);` also remove the comma from `std::vector<std::vector<int>> = {{...}, {...}}`?

ioeric added inline comments. · May 13 2016, 2:08 AM

lib/Format/Format.cpp
1822	I should've added this case to the unit tests, sorry... But I think we can either handle the constructor initializer's tok::l_brace specially or annotate it? The latter solution would let us do `cleanupLeft(Line->First, tok::comma, TT_CtorInitializerLBrace);`.

Used the reviewer's awesome templates for checkPair().
Removed checkConstructorInitList().
Moved the InCtorInitializer context setting before the tok::comma check so that the InCtorInitializer context is set even if there is a syntax error such as ":," around the ctor colon.

lib/Format/Format.cpp
1822	Just found out constructor initializer's commas are already annotated. Then, this can be easily fixed with `cleanupLeft(Line->First, TT_CtorInitializerComma, tok::l_brace);`.
1912	Thanks for the awesome templates!
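
The false positive discussed for line 1822, and why matching on the annotated comma avoids it, can be illustrated with a small standalone sketch. Everything here is a hypothetical model rather than clang-format's real types: `Kind` stands for the lexical token kind, `Type` for the annotation (only a `CtorInitializerComma` value is modelled), and `cleanupLeft` takes a predicate instead of the templated kind arguments. It also assumes no token is deleted before the pass starts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified token model.
enum class Kind { Comma, LBrace, RBrace, Other };
enum class Type { Unknown, CtorInitializerComma };

struct Token {
  Kind K;
  Type T;
  bool Deleted;
};

Token tok(Kind K, Type T = Type::Unknown) { return Token{K, T, false}; }

// Deletes the left token of every adjacent live pair accepted by Match and
// returns how many tokens were deleted.
template <typename Pred>
int cleanupLeft(std::vector<Token> &Tokens, Pred Match) {
  int NumDeleted = 0;
  std::size_t Left = 0;
  for (std::size_t Right = 1; Right < Tokens.size(); ++Right) {
    if (Tokens[Right].Deleted)
      continue;
    if (Match(Tokens[Left], Tokens[Right])) {
      Tokens[Left].Deleted = true;
      ++NumDeleted;
    }
    Left = Right;
  }
  return NumDeleted;
}

// Matches on the lexical kind only: any comma followed by '{'.
bool anyCommaBeforeBrace(const Token &L, const Token &R) {
  return L.K == Kind::Comma && R.K == Kind::LBrace;
}

// Matches only commas annotated as constructor-initializer commas.
bool ctorCommaBeforeBrace(const Token &L, const Token &R) {
  return L.T == Type::CtorInitializerComma && R.K == Kind::LBrace;
}

// Tokens for a nested braced list such as "{{1}, {2}}".
std::vector<Token> nestedList() {
  return {tok(Kind::LBrace), tok(Kind::LBrace), tok(Kind::Other),
          tok(Kind::RBrace), tok(Kind::Comma),  tok(Kind::LBrace),
          tok(Kind::Other),  tok(Kind::RBrace), tok(Kind::RBrace)};
}

// Tokens for a constructor initializer tail such as "x, {}", where the
// comma carries the ctor-initializer annotation.
std::vector<Token> ctorList() {
  return {tok(Kind::Other), tok(Kind::Comma, Type::CtorInitializerComma),
          tok(Kind::LBrace), tok(Kind::RBrace)};
}
```

In this model, the kind-based matcher deletes the legitimate comma in the nested list, while the annotation-based matcher leaves the nested list alone and still removes the trailing comma before the constructor body.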

nit: add a missing space.

Looks good :-).

This revision is now accepted and ready to land. · May 17 2016, 10:49 PM

Closed by commit rL269888: Make clang-format cleaner remove redundant commas in list and redundant colon… (authored by ioeric). · May 18 2016, 1:09 AM

This revision was automatically updated to reflect the committed changes.

Diff 55810

lib/Format/Format.cpp

[...] analyze(TokenAnnotator &Annotator,
     // To determine if some redundant code is actually introduced by
     // replacements(e.g. deletions), we need to come up with a more
     // sophisticated way of computing affected ranges.
     AffectedRangeMgr.computeAffectedLines(AnnotatedLines.begin(),
                                           AnnotatedLines.end());

     checkEmptyNamespace(AnnotatedLines);

+    for (auto &Line : AnnotatedLines) {
+      if (Line->Affected)
+        checkConstructorInitList(*Line);
+    }
+
     return generateFixes();
   }

 private:
   bool containsOnlyComments(const AnnotatedLine &Line) {
     for (FormatToken *Tok = Line.First; Tok != nullptr; Tok = Tok->Next) {
       if (Tok->isNot(tok::comment))
         return false;
[...] bool checkEmptyNamespace(SmallVectorImpl<AnnotatedLine *> &AnnotatedLines,

     for (unsigned i = InitLine; i <= CurrentLine; ++i) {
       DeletedLines.insert(i);
     }

     return true;
   }

+  // Looks for and removes redundant colon/commas in the constructor initializer
+  // list.
+  void checkConstructorInitList(AnnotatedLine &Line) {
+    FormatToken *Tok = Line.First;
+    while (Tok && Tok->Type != TT_CtorInitializerColon) {
+      Tok = Tok->Next;
+    }
+    if (!Tok)
+      return;
+
+    assert(Tok->is(tok::colon) && Tok->Type == TT_CtorInitializerColon);
+    FormatToken *CtorColonTok = Tok;
+    FormatToken *LastTokenNotDeleted = Tok;
+    Tok = Tok->Next;
+    // This vector stores comments between the last token not deleted and the
+    // current token.
+    SmallVector<FormatToken *, 1> Comments;
+    // True if the initializer list is empty, i.e. the intializer colon is
+    // redundant.
+    bool IsListEmpty = true;
+    while (Tok) {
+      switch (Tok->Tok.getKind()) {
+      case tok::comma:
+        if (LastTokenNotDeleted->isOneOf(tok::comma, tok::colon)) {
+          deleteToken(Tok);
+          // If there is a new line before the deleted comma, the comment may
+          // belong to the previous token.
+          if (!Tok->HasUnescapedNewline)
+            for (auto *Comment : Comments)
+              deleteToken(Comment);
+        }
+        break;
+      case tok::l_paren:
+        // We need to skip a pair of parentheses here because it is possible
+        // that "(..., { ... })" appears in the initialization list, and we do
+        // not want to delete the comma before '{' inside the parentheses.
+        if (!Tok->MatchingParen)
+          return; // FIXME: error handling.
+        Tok = Tok->MatchingParen;
+        LastTokenNotDeleted = Tok->Previous;
+        break;
+      case tok::l_brace:
+        // Reach the end of the initializer list; delete the comma at the end of
+        // the list.
+        if (LastTokenNotDeleted->is(tok::comma)) {
+          deleteToken(LastTokenNotDeleted);
+          for (auto *Comment : Comments)
+            deleteToken(Comment);
+        }
+        break;
+      case tok::comment:
+        // If the last deleted token is followed by a comment "//...", then we
+        // delete the comment as well.
+        if (Tok->Previous &&
+            DeletedTokens.find(Tok->Previous) != DeletedTokens.end() &&
+            Tok->TokenText.startswith("//"))
+          deleteToken(Tok);
+        else
+          Comments.push_back(Tok);
+        break;
+      default:
+        IsListEmpty = false;
+        break;
+      }
+      if (Tok->isNot(tok::comment)) {
+        Comments.clear();
+      }
+      if (DeletedTokens.find(Tok) == DeletedTokens.end() &&
+          Tok->isNot(tok::comment))
+        LastTokenNotDeleted = Tok;
+
+      Tok = Tok->Next;
+    }
+
+    if (IsListEmpty)
+      deleteToken(CtorColonTok);
+  }
+
   // Delete the given token.
   inline void deleteToken(FormatToken *Tok) {
     if (Tok)
       DeletedTokens.insert(Tok);
   }

   tooling::Replacements generateFixes() {
     tooling::Replacements Fixes;
[...]

unittests/Format/CleanupTest.cpp

[...] TEST_F(CleanupTest, EmptyNamespaceWithCommentsBreakBeforeBrace) {
   std::string Expected = "\n\n\n\n\n\n\n\n\n\n";
   std::vector<tooling::Range> Ranges(1, tooling::Range(0, Code.size()));
   FormatStyle Style = getLLVMStyle();
   Style.BraceWrapping.AfterNamespace = true;
   std::string Result = cleanup(Code, Ranges, Style);
   EXPECT_EQ(Expected, Result);
 }

+TEST_F(CleanupTest, CtorInitializationSimpleRedundantComma) {
+  std::string Code = "class A {\nA() : , {} };";
+  std::string Expected = "class A {\nA() {} };";
+  std::vector<tooling::Range> Ranges;
+  Ranges.push_back(tooling::Range(17, 0));
+  Ranges.push_back(tooling::Range(19, 0));
+  std::string Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+
+  Code = "class A {\nA() : x(1), {} };";
+  Expected = "class A {\nA() : x(1) {} };";
+  Ranges.clear();
+  Ranges.push_back(tooling::Range(23, 0));
+  Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+}
+
+TEST_F(CleanupTest, CtorInitializationBracesInParens) {
+  std::string Code = "class A {\nA() : x({1}),, {} };";
+  std::string Expected = "class A {\nA() : x({1}) {} };";
+  std::vector<tooling::Range> Ranges;
+  Ranges.push_back(tooling::Range(24, 0));
+  Ranges.push_back(tooling::Range(26, 0));
+  std::string Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+}
+
+TEST_F(CleanupTest, RedundantCommaNotInAffectedRanges) {
+  std::string Code =
+      "class A {\nA() : x({1}), /* comment */, { int x = 0; } };";
+  std::string Expected =
+      "class A {\nA() : x({1}), /* comment */, { int x = 0; } };";
+  // Set the affected range to be "int x = 0", which does not intercept the
+  // constructor initialization list.
+  std::vector<tooling::Range> Ranges(1, tooling::Range(42, 9));
+  std::string Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+
+  Code = "class A {\nA() : x(1), {} };";
+  Expected = "class A {\nA() : x(1), {} };";
+  // No range. Fixer should do nothing.
+  Ranges.clear();
+  Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+}
+
+TEST_F(CleanupTest, CtorInitializationCommentAroundCommas) {
+  // Remove redundant commas and comment between them.
+  std::string Code = "class A {\nA() : x({1}), /* comment */, {} };";
+  std::string Expected = "class A {\nA() : x({1}) {} };";
+  std::vector<tooling::Range> Ranges;
+  Ranges.push_back(tooling::Range(25, 0));
+  Ranges.push_back(tooling::Range(40, 0));
+  std::string Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+
+  // Remove trailing comma and comment.
+  Code = "class A {\nA() : x({1}), // comment\n{} };";
+  Expected = "class A {\nA() : x({1})\n{} };";
+  Ranges = std::vector<tooling::Range>(1, tooling::Range(25, 0));
+  Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+
+  // Remove trailing comma, but leave the comment.
+  Code = "class A {\nA() : x({1}), // comment\n , y(1),{} };";
+  Expected = "class A {\nA() : x({1}), // comment\n y(1){} };";
+  Ranges = std::vector<tooling::Range>(1, tooling::Range(38, 0));
+  Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+
+  // Remove trailing comma and the comment before it.
+  Code = "class A {\nA() : x({1}), \n/* comment */, y(1),{} };";
+  Expected = "class A {\nA() : x({1}), \n y(1){} };";
+  Ranges = std::vector<tooling::Range>(1, tooling::Range(40, 0));
+  Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+
+  // Remove trailing comma and the comment after it.
+  Code = "class A {\nA() : , // comment\n y(1),{} };";
+  Expected = "class A {\nA() : \n y(1){} };";
+  Ranges = std::vector<tooling::Range>(1, tooling::Range(17, 0));
+  Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+}
+
+TEST_F(CleanupTest, SkipImbalancedParentheses) {
+  std::string Code = "class A {\nA() : x((),, {} };";
+  std::string Expected = "class A {\nA() : x((),, {} };";
+  std::vector<tooling::Range> Ranges(1, tooling::Range(0, Code.size()));
+  std::string Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+}
+
+TEST_F(CleanupTest, CtorInitializerInNamespace) {
+  std::string Code = "namespace A {\n"
+                     "namespace B {\n" // missing r_brace
+                     "} // namespace A\n\n"
+                     "namespace C {\n"
+                     "class A { A() : x(0),, {} };\n"
+                     "inline namespace E { namespace { } }\n"
+                     "}";
+  std::string Expected = "namespace A {\n"
+                         "\n\n\nnamespace C {\n"
+                         "class A { A() : x(0) {} };\n \n"
+                         "}";
+  std::vector<tooling::Range> Ranges(1, tooling::Range(0, Code.size()));
+  std::string Result = cleanup(Code, Ranges);
+  EXPECT_EQ(Expected, Result);
+}
+
 } // end namespace
 } // end namespace format
 } // end namespace clang

This is an archive of the discontinued LLVM Phabricator instance.

Make clang-format cleaner remove redundant commas in list and redundant colon in constructor initializer.
ClosedPublic