This is an archive of the discontinued LLVM Phabricator instance.

Make clang-format cleaner remove redundant commas in list and redundant colon in constructor initializer.
ClosedPublic

Authored by ioeric on May 2 2016, 6:47 AM.

Diff Detail

Event Timeline

ioeric updated this revision to Diff 55810.May 2 2016, 6:47 AM
ioeric retitled this revision from to Make clang-format cleaner remove redundant commas/colons in constructor initializer list..
ioeric updated this object.
ioeric added reviewers: djasper, klimek.
ioeric added a subscriber: cfe-commits.
djasper added inline comments.May 8 2016, 12:12 PM
lib/Format/Format.cpp
1821

Why are we restricting this to constructor initializers? I think we should directly be more generic and clean up different lists. Also, as an idea? Could we make this very generic and implement a function that analyzes a line for a specific sequence of tokens? E.g., I would assume that then the current checkConstructorInitList() could be written as:

cleanupLeft(tok::comma, tok::comma);
cleanupLeft(tok::comma, tok::l_brace);
cleanupRight(tok::colon, tok::comma);
cleanupLeft(tok::colon, tok::l_brace);

With cleanupLeft/Right meaning: Find this sequence of tokens (ignoring comments) and then clean up the left or the right side.

Not sure about the exact names of functions etc. What do you think?

ioeric added inline comments.May 9 2016, 3:20 AM
lib/Format/Format.cpp
1821

I think having a generic cleanupLeft(tok::comma, tok::comma) is a great idea; however, the other three functions seem a little too generic since they'd probably only be used in constructor initializers? E.g. an expression like condition ? something : { list } would be a false positive for cleanupLeft(tok::colon, tok::l_brace), and [{...}, {...}] might be a false positive for cleanupLeft(tok::comma, tok::l_brace). Also, it seems to me that cleanupRight(tok::colon, tok::comma) would only happen in constructor initializers?

I think a mixed solution might work as well. For example, we can run cleanupLeft(tok::comma, tok::comma) across all tokens in the code, and then for constructor initializers specifically, we handle redundant colon and trailing comma. What do you think?

As for comments, we can probably handle them after all cleanup is done.

ioeric updated this revision to Diff 56696.May 10 2016, 5:37 AM
  • Extended redundant comma cleanup to general lists, and change the way constructor initializer list is handled. Removed comments cleanup, leave it for a future patch.
ioeric retitled this revision from Make clang-format cleaner remove redundant commas/colons in constructor initializer list. to Make clang-format cleaner remove redundant commas in list and redundant colon in constructor initializer..May 10 2016, 5:45 AM
djasper edited edge metadata.May 12 2016, 9:46 AM

I experimented a bit. What do you think of this?

lib/Format/Format.cpp
1822

You could turn this into:

for (auto &Line : AnnotatedLines) {
  if (Line->Affected) {
    cleanupRight(Line->First, tok::comma, tok::comma);
    cleanupRight(Line->First, TT_CtorInitializerColon, tok::comma);
    cleanupLeft(Line->First, tok::comma, tok::l_brace);
    cleanupLeft(Line->First, TT_CtorInitializerColon, tok::l_brace);
  }
}
1916

And all of this into:

// Checks pairs {start, start->next},..., {end->previous, end} and deletes one
// of the token in the pair if the left token has \p LK token kind and the
// right token has \p RK token kind. If \p DeleteLeft is true, the left token
// is deleted on match; otherwise, the right token is deleted.
template <typename LeftKind, typename RightKind>
void cleanupPair(FormatToken *Start, LeftKind LK, RightKind RK,
                 bool DeleteLeft) {
  auto NextNotDeleted = [this](const FormatToken &Tok) -> FormatToken * {
    for (auto *Res = Tok.Next; Res; Res = Res->Next)
      if (!Res->is(tok::comment) &&
          DeletedTokens.find(Res) == DeletedTokens.end())
        return Res;
    return nullptr;
  };
  for (auto *Left = Start; Left;) {
    auto *Right = NextNotDeleted(*Left);
    if (!Right)
      break;
    if (Left->is(LK) && Right->is(RK)) {
      deleteToken(DeleteLeft ? Left : Right);
      // If the right token is deleted, we should keep the left token
      // unchanged and pair it with the new right token.
      if (!DeleteLeft)
        continue;
    }
    Left = Right;
  }
}

template <typename LeftKind, typename RightKind>
void cleanupLeft(FormatToken *Start, LeftKind LK, RightKind RK) {
  cleanupPair(Start, LK, RK, /*DeleteLeft=*/true);
}

template <typename LeftKind, typename RightKind>
void cleanupRight(FormatToken *Start, LeftKind LK, RightKind RK) {
  cleanupPair(Start, LK, RK, /*DeleteLeft=*/false);
}
ioeric added inline comments.May 13 2016, 1:44 AM
lib/Format/Format.cpp
1822

Wouldn't cleanupLeft(Line->First, tok::comma, tok::l_brace); also remove the comma from std::vector<std::vector<int>> = {{...}, {...}}?

ioeric added inline comments.May 13 2016, 2:08 AM
lib/Format/Format.cpp
1822

I should've added this case into unit test, sorry...

But I think we can either handle constructor initializer's tok::l_brace specially or annotate it? The later solution can enable us to do cleanupLeft(Line->First, tok::comma, TT_CtorInitializerLBrace);.

ioeric updated this revision to Diff 57145.May 13 2016, 2:37 AM
ioeric marked 4 inline comments as done.
ioeric edited edge metadata.
  • Use reviewer's awesome templates for checkPair().
  • Remove checkConstructorInitList().
  • Moved InCtorInitializer context setting before checking tok::comma so that InCtorInitializer context can be set even we have syntax error ":," around ctor colon.
lib/Format/Format.cpp
1822

Just found out constructor initializer's commas are already annotated. Then, this can be easily fixed with cleanupLeft(Line->First, TT_CtorInitializerComma, tok::l_brace);.

1916

Thanks for the awesome templates!

ioeric updated this revision to Diff 57146.May 13 2016, 2:40 AM
  • nit: add a missing space.
djasper accepted this revision.May 17 2016, 10:49 PM
djasper edited edge metadata.

Looks good :-).

This revision is now accepted and ready to land.May 17 2016, 10:49 PM
This revision was automatically updated to reflect the committed changes.