This is an archive of the discontinued LLVM Phabricator instance.

clang-tidy: new check readability-misplaced-array-index
ClosedPublic

Authored by danielmarjamaki on Jun 8 2016, 6:55 AM.

Details

Summary

this is a new check for clang-tidy that diagnoses when it see unusual array index syntax.

there is nothing wrong about such syntax so it's not a misc check. It's just a readability check.

Diff Detail

Event Timeline

danielmarjamaki retitled this revision from to clang-tidy: new check readability-misplaced-array-index.Jun 8 2016, 6:55 AM
danielmarjamaki updated this object.
danielmarjamaki added a reviewer: alexfh.
danielmarjamaki set the repository for this revision to rL LLVM.
danielmarjamaki added a subscriber: cfe-commits.

Please mention this check in docs/ReleaseNotes.rst (in alphabetical order).

danielmarjamaki removed rL LLVM as the repository for this revision.

updated releasedocs
deeper lookup if macro is used
only warn when the replacement can be done safely. the expressions "x[y+z]" and y+z[x]" are not the same.

fixed typo in releasenotes

alexfh edited edge metadata.Jun 16 2016, 9:49 AM

The check seems reasonable, I'm surprised there's no warning in Clang that catches index[array] syntax ;)

A few comments re: implementation.

clang-tidy/readability/MisplacedArrayIndexCheck.cpp
44

Looks like this repeats getText from clang/Tooling/FixIt.h.

74

Looks like you could use createReplacement and/or getText from clang/Tooling/FixIt.h.

alexfh requested changes to this revision.Jun 17 2016, 5:44 AM
alexfh edited edge metadata.
This revision now requires changes to proceed.Jun 17 2016, 5:44 AM
danielmarjamaki added inline comments.Aug 8 2016, 4:47 AM
clang-tidy/readability/MisplacedArrayIndexCheck.cpp
44

Yes indeed..

danielmarjamaki edited edge metadata.
danielmarjamaki set the repository for this revision to rL LLVM.

Fixed review comments.

danielmarjamaki edited edge metadata.
danielmarjamaki removed rL LLVM as the repository for this revision.

Cleanup my previous commit. Ran clang-format.

alexfh added a comment.Aug 8 2016, 6:54 AM

Thanks, that's better. Still a couple of comments.

clang-tidy/readability/MisplacedArrayIndexCheck.cpp
51

I'd say "confusing" instead of "unusual". This would also help avoiding the repetition ("unusual ..., usually ..."):

"confusing array index syntax; usually, the index is inside the brackets"

55

What exactly is the recursive hasMacroID function trying to prevent? Do you have a test that breaks if you just check that the start location of the expression is not a macro?

In most cases, it's enough to check that the whole range is on the same macro expansion level using Lexer::makeFileCharRange in order to prevent most problems with macros, while allowing fixes in safe cases, e.g. replacements in parameters of EXPECT_* macros from gtest.

alexfh requested changes to this revision.Aug 8 2016, 6:54 AM
alexfh edited edge metadata.
This revision now requires changes to proceed.Aug 8 2016, 6:54 AM
aaron.ballman added a subscriber: aaron.ballman.

Is there a reason we don't want this check to be a part of the clang frontend, rather than as a clang-tidy check?

clang-tidy/readability/MisplacedArrayIndexCheck.cpp
28–29

Can this use hasType(pointerType()) instead of hasType(pointsTo(qualType())) ?

49

Should not use auto here because the type is not spelled out in the initialization.

51

Instead of "array index syntax", perhaps "array subscript expression"?

53

What does "contraptions" mean in this context?

I think the replacement is wrong for something like:

int *arr; int offs1, offs2;
offs1[arr + offs2] = 42;

which I think would give:

int *arr; int offs1, offs2;
arr + offs2[offs1] = 42;

If the precedence of the "indexing" argument is less than that of the subscript operator, then you need to insert parens:

int *arr; int offs1, offs2;
(arr + offs2)[offs1] = 42;
alexfh added a comment.Aug 8 2016, 8:37 AM

I think the replacement is wrong for something like:

int *arr; int offs1, offs2;
offs1[arr + offs2] = 42;

which I think would give:

int *arr; int offs1, offs2;
arr + offs2[offs1] = 42;

If the precedence of the "indexing" argument is less than that of the subscript operator, then you need to insert parens:

int *arr; int offs1, offs2;
(arr + offs2)[offs1] = 42;

Indeed. It might be relevant to future (current ones seem to be fine) uses of tooling::fixit::createReplacement as well, so should probably be taken care of in its implementation. Maybe an additional hint from the calling code is needed (bool InsertParensIfNeeded = false).

alexfh added inline comments.Aug 8 2016, 8:51 AM
clang-tidy/readability/MisplacedArrayIndexCheck.cpp
49

While in general I agree with this recommendation, auto Diag = diag(...); has almost become an idiom in this code. I'd not worry about obscurity of auto in this context, especially, if the variable is renamed to Diag, which will make its meaning and possible ways of using it clearer.

aaron.ballman added inline comments.Aug 8 2016, 8:54 AM
clang-tidy/readability/MisplacedArrayIndexCheck.cpp
49

I'm okay with that approach.

danielmarjamaki marked an inline comment as done.Aug 9 2016, 12:29 AM

Is there a reason we don't want this check to be a part of the clang frontend, rather than as a clang-tidy check?

I discussed this with frontend and clang-tidy people in the llvm meeting last year and we came to the conclusion it was better in clang-tidy.

I don't remember the exact reasons.

But I think that this code is just "weird". It could be a bug but I write the warning mostly for readability reasons.

clang-tidy/readability/MisplacedArrayIndexCheck.cpp
44

this is fixed. As you suggested I used createReplacement() in the fixit. It's much nicer!

I think the replacement is wrong for something like:

int *arr; int offs1, offs2;
offs1[arr + offs2] = 42;

which I think would give:

int *arr; int offs1, offs2;
arr + offs2[offs1] = 42;

If the precedence of the "indexing" argument is less than that of the subscript operator, then you need to insert parens:

int *arr; int offs1, offs2;
(arr + offs2)[offs1] = 42;

I don't think so. The matcher says:

hasRHS(ignoringParenImpCasts(
          anyOf(stringLiteral(), declRefExpr(hasType(pointsTo(qualType()))),
                memberExpr(hasType(pointsTo(qualType()))

I think it's hard to know what the replaced code should be. In some cases it might be better with:

arr[offs1 + offs2] = 42;
danielmarjamaki edited edge metadata.

take care of review comments

danielmarjamaki marked 4 inline comments as done.Aug 9 2016, 5:31 AM

as far as I see the only unsolved review comment now is about macros. if it can be handled better.

clang-tidy/readability/MisplacedArrayIndexCheck.cpp
29–30

Yes thanks!

54

you can do lots of weird things with macros. so I wanted to just diagnose and then have no fixits that would be wrong etc. I removed the comment since I think the code is pretty clear.

56

What exactly is the recursive hasMacroID function trying to prevent? Do you have a test that breaks if you just check that the start location of the expression is not a macro?

I am worried about macros anywhere. I did not want to apply fixes for this code: 0[ABC] but maybe using the Lexer would make that safe.

In most cases, it's enough to check that the whole range is on the same macro expansion level using Lexer::makeFileCharRange in order to prevent most problems with macros, while allowing fixes in safe cases, e.g. replacements in parameters of EXPECT_* macros from gtest.

I was very unsuccessful with that. I must do something wrong...

CharSourceRange LRange = Lexer::makeFileCharRange(
    CharSourceRange::getCharRange(
        ArraySubscriptE->getLHS()->getSourceRange()),
    Result.Context->getSourceManager(), Result.Context->getLangOpts());

CharSourceRange RRange = Lexer::makeFileCharRange(
    CharSourceRange::getCharRange(
        ArraySubscriptE->getLHS()->getSourceRange()),
    Result.Context->getSourceManager(), Result.Context->getLangOpts());

StringRef LText = Lexer::getSourceText(LRange, *Result.SourceManager,
                                      Result.Context->getLangOpts());

StringRef RText = Lexer::getSourceText(RRange, *Result.SourceManager,
                                      Result.Context->getLangOpts());

Diag << FixItHint::CreateReplacement(ArraySubscriptE->getLHS()->getSourceRange(),
                                     RText);
Diag << FixItHint::CreateReplacement(ArraySubscriptE->getRHS()->getSourceRange(),
                                  LText);

No fixits worked with that code, with or without macros.

Do you see what I did wrong?

aaron.ballman added inline comments.Aug 9 2016, 5:53 AM
test/clang-tidy/readability-misplaced-array-index.cpp
32

Why is this considered to be "normal syntax" and not worth getting a diagnostic?

danielmarjamaki marked 2 inline comments as done.Aug 9 2016, 6:00 AM
danielmarjamaki added inline comments.
test/clang-tidy/readability-misplaced-array-index.cpp
32

hmm.. I agree that it would be good to have a diagnostic for this code also.

I currently only diagnose when subscript is stringLiteral, declRefExpr or memberExpr. These are the cases when the code can be easily fixed by just replacing the expressions.

I will look into writing a diagnostic for this also.

I think the replacement is wrong for something like:

int *arr; int offs1, offs2;
offs1[arr + offs2] = 42;

which I think would give:

int *arr; int offs1, offs2;
arr + offs2[offs1] = 42;

If the precedence of the "indexing" argument is less than that of the subscript operator, then you need to insert parens:

int *arr; int offs1, offs2;
(arr + offs2)[offs1] = 42;

I don't think so. The matcher says:

hasRHS(ignoringParenImpCasts(
          anyOf(stringLiteral(), declRefExpr(hasType(pointsTo(qualType()))),
                memberExpr(hasType(pointsTo(qualType()))

I think it's hard to know what the replaced code should be. In some cases it might be better with:

arr[offs1 + offs2] = 42;

Oh, yeah, you're right.

danielmarjamaki edited edge metadata.

More generic diagnostic. Diagnose all integerType[pointerType] expressions.

ran clang-format

danielmarjamaki marked an inline comment as done.Aug 9 2016, 7:13 AM
alexfh requested changes to this revision.Aug 16 2016, 9:32 AM
alexfh edited edge metadata.
alexfh added inline comments.
clang-tidy/readability/MisplacedArrayIndexCheck.cpp
57

Both ranges seem to be using ArraySubscriptE->getLHS()->getSourceRange(). I guess, the second one should refer to ArraySubscriptE->getRHS()->getSourceRange() instead?

docs/clang-tidy/checks/readability-misplaced-array-index.rst
13

Please format examples according to LLVM style (specifically, leave braces at the end of the previous line.

test/clang-tidy/readability-misplaced-array-index.cpp
17

Please truncate all CHECK-MESSAGES except for the first one after "confusing array subscript expression".

26

Please add a CHECK-FIXES to verify the original code is left intact. Same for other places where no fixits are expected.

This revision now requires changes to proceed.Aug 16 2016, 9:32 AM
danielmarjamaki edited edge metadata.

fixed review comments

danielmarjamaki marked 6 inline comments as done.Aug 24 2016, 6:54 AM
danielmarjamaki added inline comments.
clang-tidy/readability/MisplacedArrayIndexCheck.cpp
58

I removed hasMacroId() and use fixit::getText(). The replacements look good now.

docs/clang-tidy/checks/readability-misplaced-array-index.rst
14

ok thanks same mistake I've done before. Should I start using uppercase variable names from now on?

alexfh requested changes to this revision.Aug 24 2016, 10:30 AM
alexfh edited edge metadata.

A few nits.

clang-tidy/readability/MisplacedArrayIndexCheck.h
20

Please enclose inline code snippets in backquotes.

docs/clang-tidy/checks/readability-misplaced-array-index.rst
11

This should be .. code-block:: c++.

14

Interesting question. Probably, better to do so.

This revision now requires changes to proceed.Aug 24 2016, 10:30 AM
danielmarjamaki edited edge metadata.
danielmarjamaki marked 2 inline comments as done.

fix review comments

danielmarjamaki marked 4 inline comments as done.Aug 29 2016, 5:24 AM
alexfh accepted this revision.Sep 7 2016, 4:38 AM
alexfh edited edge metadata.

Looks good.

Please ensure it works reasonably on LLVM code before submitting.

Sorry for the delay. Feel free to ping earlier.

This revision is now accepted and ready to land.Sep 7 2016, 4:38 AM
This revision was automatically updated to reflect the committed changes.