This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] [modernize-use-auto] Correct way to calculate a type name length for multi-token types
ClosedPublic

Authored by zinovy.nis on Apr 21 2018, 11:47 PM.

Details

Diff Detail

Event Timeline

zinovy.nis created this revision.Apr 21 2018, 11:47 PM

I think spaces that will be removed should be counted - long long is 9.

OTOH, MinTypeNameLength isn't likely to be set high enough for that to matter.

I think spaces that will be removed should be counted - long long is 9.

I thought about it, but what about "long long int

  • * * *"? Is it still 9?

I think it's a business of clang-format to remove excessive spaces, not of
clang-tidy.

вс, 22 апр. 2018 г. в 10:11, Malcolm Parsons via Phabricator <
reviews@reviews.llvm.org>:

malcolm.parsons added a comment.

I think spaces that will be removed should be counted - long long is 9.

OTOH, MinTypeNameLength isn't likely to be set high enough for that to
matter.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45927

alexfh requested changes to this revision.Apr 30 2018, 8:08 AM
alexfh added inline comments.
clang-tidy/modernize/UseAutoCheck.cpp
33 ↗(On Diff #143465)

Why const char& and not just char? Moreover, these two functions can be replaced with lambdas. See below.

445–449 ↗(On Diff #143465)
static size_t GetTypeNameLength(const TypeLoc &Loc, const ASTContext &Context, bool IgnoreStars) {
  const StringRef S = tooling::fixit::getText(Loc.getSourceRange(), Context);
  if (IgnoreStars)
    return llvm::count_if(S, [] (char C) { return std::isspace(C) || C == '*'; });
  return llvm::count_if(S, [] (char C) { return std::isspace(C); });
}
clang-tidy/modernize/UseAutoCheck.h
31 ↗(On Diff #143465)

I'd make this a static function and remove the SpacePredicate field as well. See the comment below.

This revision now requires changes to proceed.Apr 30 2018, 8:08 AM
zinovy.nis added inline comments.Apr 30 2018, 8:29 AM
clang-tidy/modernize/UseAutoCheck.cpp
33 ↗(On Diff #143465)

Agree.

445–449 ↗(On Diff #143465)

IgnoreStars is initialized once in the ctor and is used widely for all the literals in the translation unit.
IMHO it's better to eliminate branches on IgnoreStars.

Actually, just ignoring spaces may be not the best option.

I think spaces that will be removed should be counted - long long is 9.

I thought about it, but what about "long long int

  • * * *"? Is it still 9?

I think, it's 13, if you choose to remove stars, and 17 otherwise. The difference is excessive spaces vs. required ones. Implementing proper logic may be involved, but we can simplify it to something like "count all non-space characters and a single space between words, but don't count spaces around punctuation":

enum CharType { Space, Alpha, Punctuation };
CharType LastChar = Space, BeforeSpace = Punctuation;
int NumChars = 0;
for (char C : S) {
  CharType NextChar = std::isalnum(C) ? Alpha : (std::isspace(C) || RemoveStars && C == '*') ? Space : Punctuation;
  if (NextChar != Space) {
    ++NumChars; // Count the non-space character.
    if (LastChar == Space && NextChar == Alpha && BeforeSpace == Alpha)
      ++NumChars; // Count a single space character between two words.
    BeforeSpace = NextChar;
  }
  LastChar = NextChar;
}
clang-tidy/modernize/UseAutoCheck.cpp
445–449 ↗(On Diff #143465)

IMHO it's better to eliminate branches on IgnoreStars.

Why do you think so?

I think that spreading the implementation of a rather trivial function to one method, one data member, two functions and a constructor makes the code significantly harder to read. And you for some reason propose to avoid a single branch that doesn't seem to be on any hot path. Am I missing something?

I think, it's 13, if you choose to remove stars, and 17 otherwise. The difference is excessive spaces vs. required ones. Implementing proper logic may be involved, but we can simplify it to something like "count all non-space characters and a single space between words, but don't count spaces around punctuation":

Isn't it a business of clang-format to determine the number of spaces between lexemes? IMO the solutuion you've provided for GetTypeNameLength is enough. Considering punctuation here is overkill :-)

I think, it's 13, if you choose to remove stars, and 17 otherwise. The difference is excessive spaces vs. required ones. Implementing proper logic may be involved, but we can simplify it to something like "count all non-space characters and a single space between words, but don't count spaces around punctuation":

Isn't it a business of clang-format to determine the number of spaces between lexemes? IMO the solutuion you've provided for GetTypeNameLength is enough. Considering punctuation here is overkill :-)

The algorithm in your patch makes decisions based on the assumption that there may be no spaces at all, which is definitely wrong. I'm proposing an algorithm that would at least handle things like unsigned char correctly.

zinovy.nis updated this revision to Diff 144727.May 1 2018, 8:12 AM
  • Applied Alexander's changes.
alexfh requested changes to this revision.May 3 2018, 8:53 AM
alexfh added inline comments.
clang-tidy/modernize-use-auto-min-type-name-length.cpp
61–83

These tests could be more useful, if they verified boundary values of the MinTypeNameLength, e.g. that long int is replaced with auto when the option is set to 8 and it stays long intwith the option set to 9.

Please also add tests with template typenames: e.g. the lenght of T < int > should be considered 6 and the length of T < const int > is 12. I guess, the algorithm I proposed will be incorrect for pointer type arguments of templates (the length of T<int*> should be 7 regardless of RemoveStars), but this can be fixed by conditionally (on RemoveStars) trimming *s from the right of the type name and then treating them as punctuation. So instead of

for (const unsigned char C : tooling::fixit::getText(SR, Context)) {
  const CharType NextChar =
      std::isalnum(C)
          ? Alpha
          : (std::isspace(C) || (!RemoveStars && C == '*')) ? Space
                                                            : Punctuation;

you could use something similar to

StringRef Text = tooling::fixit::getText(SR, Context);
if (RemoveStars)
  Text = Text.rtrim(" \t\v\n\r*");
for (const unsigned char C : Text) {
  const CharType NextChar =
      std::isalnum(C) ? Alpha : std::isspace(C) ? Space : Punctuation;
This revision now requires changes to proceed.May 3 2018, 8:53 AM
zinovy.nis added inline comments.May 5 2018, 12:27 PM
clang-tidy/modernize-use-auto-min-type-name-length.cpp
61–83

These tests could be more useful, if they verified boundary values of the MinTypeNameLength, e.g. that long int is replaced with auto when the option is set to 8 and it stays long intwith the option set to 9.

int-test is just for that :-)

int a = static_cast<int>(foo());                 // ---> int  a = ...
zinovy.nis added inline comments.May 6 2018, 11:28 AM
clang-tidy/modernize-use-auto-min-type-name-length.cpp
61–83

I measured lengths for template cases:

S=std::string *   len=12
S=std::vector<std::string> *   len=25
S=std::vector<const std::string> *   len=31
S=std::string        *   len=12
S=std::vector<std::string>       *   len=25
S=std::vector<const std::string>    *   len=31
zinovy.nis added inline comments.May 6 2018, 12:38 PM
clang-tidy/modernize-use-auto-min-type-name-length.cpp
61–83

RemoveStars==1 here.

rsmith added a subscriber: rsmith.May 6 2018, 3:36 PM
rsmith added inline comments.
modernize/UseAutoCheck.cpp
40

isspace here is wrong: it's a locale-sensitive check. Clang assumes UTF-8 throughout; you can use isWhitespace from clang/Basic/CharInfo.h to test if the character should be treated as whitespace.

alexfh added inline comments.May 9 2018, 7:59 AM
clang-tidy/modernize-use-auto-min-type-name-length.cpp
61–83

S=std::string * len=12

With RemoveStars this should be 11?

S=std::vector<std::string> * len=25

24?

S=std::vector<const std::string> * len=31

30?

The point of my comment was that stars should be treated specially only at the end of the type, not inside template parameters, e.g. T<int****> should have length 10 regardless of RemoveStars. All your examples are with trailing stars.

61–83

int-test is just for that :-)

Nope, the point is to verify that the length of multi-token type names is calculated correctly. int is a single token.

  • Fix for templated type names. Thanks AlexanderK for pointing.
zinovy.nis marked 5 inline comments as done.May 14 2018, 1:46 PM
zinovy.nis marked an inline comment as done.May 16 2018, 2:26 AM
zinovy.nis added inline comments.
modernize/UseAutoCheck.cpp
40

Thanks!

zinovy.nis marked an inline comment as done.May 18 2018, 6:26 AM

Alexander, can you please have a look at the latest patch? Thanks.

This revision is now accepted and ready to land.Jun 6 2018, 8:00 AM
This revision was automatically updated to reflect the committed changes.