Page MenuHomePhabricator

[clang-tidy] Add new 'readability-uppercase-literal-suffix' check (CERT DCL16-C, MISRA C:2012, 7.3, MISRA C++:2008, 2-13-4)
ClosedPublic

Authored by lebedev.ri on Sep 28 2018, 1:39 PM.

Details

Summary

Detects when the integral literal or floating point (decimal or hexadecimal)
literal has non-uppercase suffix, and suggests to make the suffix uppercase,
with fix-it.

All valid combinations of suffixes are supported.

auto x = 1;  // OK, no suffix.

auto x = 1u; // warning: integer literal suffix 'u' is not upper-case

auto x = 1U; // OK, suffix is uppercase.

...

References:

  • CERT DCL16-C
  • MISRA C:2012, 7.3 - The lowercase character "l" shall not be used in a literal suffix
  • MISRA C++:2008, 2-13-4 - Literal suffixes shall be upper case

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
lebedev.ri added inline comments.Sep 29 2018, 12:09 PM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
53 ↗(On Diff #167552)

Then we will not have an early-return, which is worse than this.

85 ↗(On Diff #167552)

No, because shouldReplaceLiteralSuffix() is a member function which returns that type.
I think it should stay a member function, so theoretically NewSuffix could be a [second] template param, but that is kinda ugly..
I also can't simplify it via using because the NewSuffix is private.

Perhaps we should keep this as-is?

95 ↗(On Diff #167552)

I looked at a randomly-selected few dozen calls to this function within clang-tidy, and none of those do this.
But assert added, better than nothing.

129 ↗(On Diff #167552)

I still want to see the old suffix, but i think we can form the FixitHint here, and store it.

135 ↗(On Diff #167552)

bind() still wants the name though.

152 ↗(On Diff #167552)

Very obscure example (like all macros is), but i think it shows the point:

#include <stdio.h>
#include <stdio.h>

#define xstr(s) str(s)
#define str(s) #s

#define dump(X) printf("%u is " str(X) "nits", X);

int main () {
  dump(1u);

  return 0;
}

will normally print 1 is 1units
But if you uppercase it, it will print 1 is 1Units.

I would honestly prefer to give macros a pass here..

165 ↗(On Diff #167552)

If the warnings are aggregated (i.e. not raw make dump), with only the first line shown, the suffix will still be invisible.

Regarding the pointer direction, i'm not sure.
For now i have reworded the diag to justify pointing at the literal itself.

All those are UserDefinedLiteral, so we should be good.. https://godbolt.org/z/PcGi0B
Also, it seems the suffix can't be set for these constants: https://godbolt.org/z/YHTqke
So i'm not sure what to test. Can you give an example of a test?

I am not suggesting that these should be all uppercase or anything. But the tests should demonstrate, that user-defined literals do not impact analysis and give no false positives.

clang-tidy/hicpp/HICPPTidyModule.cpp
24 ↗(On Diff #167620)

spurious formatting fix, can be committed separate.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
42 ↗(On Diff #167552)

Yes, few nits.

  • 15 in decimal and is (no comma)
  • skip to instead of skip until
52 ↗(On Diff #167552)

fine then.

53 ↗(On Diff #167552)

Ok. Can the dyn_cast be null at all? Shouldn't integerliteral always be a BuiltinType?

85 ↗(On Diff #167552)

Why does it need to be a member? It looks like it only accesses NewSuffix which does nothing that requires private access to the check class.

152 ↗(On Diff #167552)

Transformation is excluded of course.
The user can still silence the warning in case it is misplaced.

165 ↗(On Diff #167552)

I don't understand that. The warning message does include the source location that would be clearly on the literal suffix and the warning without the suffix printed is clear as well. Having this slightly simpler diagnostic would simplify the code significantly.

90 ↗(On Diff #167620)

you can remove the clang:: as we are in that namespace already.

92 ↗(On Diff #167620)

I think you can drop some of the newlines in this function to make the code a bit more dense.

clang-tidy/utils/ASTUtils.cpp
70 ↗(On Diff #167620)

Please add the missing . at the end of the sentence

docs/ReleaseNotes.rst
117 ↗(On Diff #167620)

I believe the doc links for the aliases dangle right now. Please add the doc pages for them, too.

lebedev.ri marked 8 inline comments as done.Sep 29 2018, 2:11 PM
lebedev.ri added inline comments.
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
165 ↗(On Diff #167552)

*location*. which will still be a location, if only the first line of the warning is displayed.

We won't even win all that much.
Sure, we will return Optional<FixitHint>, but we will still need to do all that internal stuff to find the old suffix, uppercase it, compare them, etc.

And we lose a very useful ability to check what it considered to be the old suffix in the tests.

It is basically the same question as in D51949.
If you insist, sure, i can do that, but i *really* do believe this is WRONG.

lebedev.ri marked 16 inline comments as done.

Addressed remaining review notes:

  • Fixed dangling links in docs
  • Don't mishandle user-defined literals
  • Don't ignore macros, do check them. Thanks to @AaronBallman for the macro concatenation example :)
  • Partial fix-it support for macros.

...

Herald added subscribers: Restricted Project, llvm-commits. · View Herald TranscriptSep 30 2018, 12:00 PM
lebedev.ri added inline comments.Sep 30 2018, 12:00 PM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
53 ↗(On Diff #167552)

I don't know?
I would guess no.

85 ↗(On Diff #167552)

Passing LangOpts turned out to be sufficient to move it into anon namespace.

165 ↗(On Diff #167552)

And one more problem here, where do we point if we don't have a fix-it to get the suffix location from?

lebedev.ri changed the repository for this revision from rCRT Compiler Runtime to rCTE Clang Tools Extra.Sep 30 2018, 12:01 PM
lebedev.ri removed subscribers: llvm-commits, Restricted Project.
JonasToth added inline comments.Sep 30 2018, 12:30 PM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
165 ↗(On Diff #167552)

That point is valid, but if we dont have a suffix location, there is no suffix?
I dont insist :)

117 ↗(On Diff #167658)

the comma is not necessary

119 ↗(On Diff #167658)

values are usually not annotated with const, please remove it for consistency

122 ↗(On Diff #167658)

please remove the Now this is troubling.
Please use two or more (what, macros?) instead of two+

192 ↗(On Diff #167658)

ast_matchers:: is not necessary, because there is a using ... at the top of the file

clang-tidy/utils/ASTUtils.cpp
72 ↗(On Diff #167658)

Does it make sense to have a nullptr for SM? I think you can use a reference here

docs/clang-tidy/checks/list.rst
67 ↗(On Diff #167658)

hicpp alias is missing here

lebedev.ri marked 5 inline comments as done.

Address review notes.

clang-tidy/readability/IdentifierNamingCheck.cpp
660–678 ↗(On Diff #167658)

@JonasToth that is what this code did.
From a quick look i'm not sure what it would decide without the source manager,
so i simply refactored it as a function, and kept the semantics.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
165 ↗(On Diff #167552)

See test/clang-tidy/readability-uppercase-literal-suffix-integer-macro.cpp, horrible_macros()
I guess if we point at the suffix, we will also get less useful expanded from messages.

Two main points: I don't think this check is covering all of the suffixes (I don't see q or i32 support, for instance), and at least for the CERT rule this is covering, it diagnoses more than it should. The CERT rule is specific to l vs L but imposes no requirements on u vs U.

clang-tidy/readability/ReadabilityTidyModule.cpp
88–89 ↗(On Diff #167661)

Please keep this sorted alphabetically.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
25 ↗(On Diff #167661)

It's unfortunate that clang::IntegerLiteral and clang::tidy::readability::IntegerLiteral are distinct types that are strongly related to one another. I'm not keen on this name, as it means reading the code, an unqualified IntegerLiteral is hard to reason about.

30–31 ↗(On Diff #167661)

There are other suffixes for integer literals. See NumericLiteralParser.

37 ↗(On Diff #167661)

Same naming concerns here.

47–48 ↗(On Diff #167661)

There are more suffixes here (see NumericLiteralParser again), and this comment seems to be copy pasta.

99–100 ↗(On Diff #167661)

Don't use auto here.

122 ↗(On Diff #167661)

Don't use auto here.

lebedev.ri marked 8 inline comments as done.

Hopefully address review notes:

  • Support more (all?) obscure suffixes (complex, q, h, f16, i{8,16,32,64})
  • Improve test coverage. There isn't coverage for every single combination/permuation of everything, but hopefully at least the basic coverage.
  • Support a list of destination suffixes, thus the replacements can be finely controlled.
JonasToth added inline comments.Oct 13 2018, 2:52 AM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
94 ↗(On Diff #169554)

Is this std::find_if?

lebedev.ri marked an inline comment as done.

Bloat with llvm::find_if()

aaron.ballman accepted this revision.Oct 17 2018, 3:03 PM

LGTM aside from minor wording nits.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
31–32 ↗(On Diff #169556)

The comment is somewhat out of sync with the code because of the i and j suffixes as well.

45 ↗(On Diff #169556)

I think you mean f is both a valid hexadecimal digit in a hex float literal and a valid floating-point literal suffix.

101 ↗(On Diff #169556)

i -> I

212 ↗(On Diff #169556)

have already-uppercase suffix -> have a suffix that is already uppercase

216 ↗(On Diff #169556)

upper-case -> uppercase

docs/clang-tidy/checks/readability-uppercase-literal-suffix.rst
28 ↗(On Diff #169556)

I'd reword this slightly:

"and if a replacement is found that is different from the current suffix, then the diagnostic is issued. This allows for fine-grained control of what suffixes to consider and what their replacements should be."

This revision is now accepted and ready to land.Oct 17 2018, 3:03 PM
lebedev.ri marked 6 inline comments as done.
  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).
  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull. I've pinged the authors and can hopefully get an answer quickly, but if not, I'm fine with fixing that after the patch goes in.

clang-tidy/cert/CERTTidyModule.cpp
87 ↗(On Diff #170045)

Don't use auto here.

docs/clang-tidy/checks/readability-uppercase-literal-suffix.rst
6 ↗(On Diff #170045)

You should consider calling out the CERT behavioral differences.

  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull.

I agree, i've thought so too.

That will open an interesting question: in lu, l should be upper-case. What about u? We can't keep it as-is.
We will either consistently upper-case it, or consistently lower-case it.
I.e. given [lL][uU], should we *always* produce Lu, or LU?

I've pinged the authors

Thank you.

and can hopefully get an answer quickly, but if not, I'm fine with fixing that after the patch goes in.

Thank you.

  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull.

I agree, i've thought so too.

That will open an interesting question: in lu, l should be upper-case. What about u? We can't keep it as-is.
We will either consistently upper-case it, or consistently lower-case it.
I.e. given [lL][uU], should we *always* produce Lu, or LU?

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull.

I agree, i've thought so too.

That will open an interesting question: in lu, l should be upper-case. What about u? We can't keep it as-is.
We will either consistently upper-case it, or consistently lower-case it.
I.e. given [lL][uU], should we *always* produce Lu, or LU?

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

Thank you!

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

Ok, nothing unexpected.
So the full revised list is: "L;LL:LU;LLU".

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

Thank you!

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

Ok, nothing unexpected.
So the full revised list is: "L;LL:LU;LLU".

Agreed. It might be reasonable to add the above as some extra test cases for the CERT check if they're not already covered elsewhere.

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

Thank you!

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

Ok, nothing unexpected.
So the full revised list is: "L;LL:LU;LLU".

Agreed. It might be reasonable to add the above as some extra test cases for the CERT check if they're not already covered elsewhere.

Those exist as the test cases for CERT (although, i only test the integer case for CERT, not float.), i will only need to adjust the expected output.

lebedev.ri marked 2 inline comments as done.

CERT: also handle "lu", "llu".

aaron.ballman accepted this revision.Oct 18 2018, 12:06 PM

LGTM! Thanks for the changes!

This revision was automatically updated to reflect the committed changes.
lebedev.ri reopened this revision.Oct 25 2018, 12:47 PM
This revision is now accepted and ready to land.Oct 25 2018, 12:47 PM

Reverted in rL345305 due to multiple apparently-lurking bugs.

Finally finished creducing the issue i wasn't able to understand even after D53719+D53723,
and it turned out to be hasParent()vs.hasAncestor(). (all tests added)

With that fixed, i do not observe any other bugs.

This revision is now accepted and ready to land.Oct 26 2018, 4:06 AM

Rerun on clang-tools-extra+lld+clang codebase, nothing broken.
I'm gonna reland.

This revision was automatically updated to reflect the committed changes.