This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add new 'readability-uppercase-literal-suffix' check (CERT DCL16-C, MISRA C:2012, 7.3, MISRA C++:2008, 2-13-4)
ClosedPublic

Authored by lebedev.ri on Sep 28 2018, 1:39 PM.

Details

Summary

Detects when the integral literal or floating point (decimal or hexadecimal)
literal has non-uppercase suffix, and suggests to make the suffix uppercase,
with fix-it.

All valid combinations of suffixes are supported.

auto x = 1;  // OK, no suffix.

auto x = 1u; // warning: integer literal suffix 'u' is not upper-case

auto x = 1U; // OK, suffix is uppercase.

...

References:

  • CERT DCL16-C
  • MISRA C:2012, 7.3 - The lowercase character "l" shall not be used in a literal suffix
  • MISRA C++:2008, 2-13-4 - Literal suffixes shall be upper case

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.Sep 28 2018, 1:39 PM
Eugene.Zelenko added inline comments.
docs/ReleaseNotes.rst
96 ↗(On Diff #167540)

Please move alias after new checks.

May be we should also create MISRA module?

lebedev.ri marked an inline comment as done.

Move alias to after the new check,

lebedev.ri added inline comments.Sep 28 2018, 2:33 PM
docs/ReleaseNotes.rst
96 ↗(On Diff #167540)

BTW, is there some tool to actually add this alias? I didn't find it, and had to do it by hand..

  • Deduplicate one test.

Thanks for working on this!

Related as well: http://www.codingstandard.com/rule/4-2-1-ensure-that-the-u-suffix-is-applied-to-a-literal-used-in-a-context-requiring-an-unsigned-integral-expression/
I think its wort a alias is hicpp as well

Please add tests that use user-defined literals and ensure there are no collision and that they are not diagnosed. Some examples: https://en.cppreference.com/w/cpp/language/user_literal

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
26 ↗(On Diff #167552)

maybe ClangType is not a good name, how about just type to be consistent with e.g. std::vector convention.
The use-case in your template makes it clear, that we are talking about LiteralTypes.

33 ↗(On Diff #167552)

why are these declarations necessary?

42 ↗(On Diff #167552)

The second line of the comment is slightly confusing, please make a full sentence out of it.

52 ↗(On Diff #167552)

as it hit me in my check: what about (1)ul? Is this syntactically correct and should be diagnosed too? (Please add tests if so).

In this case it should be Note.getType().IgnoreParens().getTypePtr()).

53 ↗(On Diff #167552)

Maybe the if could init T? It would require a second return false; if i am not mistaken, but looks more regular to me. No strong opinion from my side.

69 ↗(On Diff #167552)

Same comment as above

85 ↗(On Diff #167552)

These types get really long. Is it possible to put NewSuffix into the anonymous namespace as well?

90 ↗(On Diff #167552)

GIven that this variable is used mutliple times in a longer function, i feel it shuold get a longer, more descriptive name

95 ↗(On Diff #167552)

Please check if the source text could be retrieved, with a final bool parameter, thats in/out and at least assert on that.

106 ↗(On Diff #167552)

please use flloating-point instead of fp (same in other places)

129 ↗(On Diff #167552)

Could this function return a Optional<FixitHint>? That would include the range and the relacement-text. I feel that is would simplify the code, especially the amount of state that one has to keep track of.

135 ↗(On Diff #167552)

I think you can merge this matcher to stmt(eachOf(integerLiteral(intHasSuffix()).bind(), floatLiteral(fpHasSuffix()).bind()))

eachOf because we want to match all, and not short-circuit.

152 ↗(On Diff #167552)

Wouldn't it make sense to warn for the literals in macros?

165 ↗(On Diff #167552)

Lets move this diag into the true branch of the if stmt and drop the ´else`.

165 ↗(On Diff #167552)

I think the warning should point at the suffix, which can be retrieved from the replacement range. Then you don't need to include the suffix itself in the diagnostic

docs/ReleaseNotes.rst
96 ↗(On Diff #167540)

So far there is nothing, but would be a nice addition :)

docs/clang-tidy/checks/readability-uppercase-literal-suffix.rst
9 ↗(On Diff #167552)

I feel that the sentence could be improved a bit.

  • literal has *a* non-uppercase
  • suffix and provides a fix-it-hint with the uppercase suffix. (i think the comma after suffix is not necessary)
test/clang-tidy/readability-uppercase-literal-suffix-floating-point.cpp
26 ↗(On Diff #167552)

nit: comparing int to double, same below.

test/clang-tidy/readability-uppercase-literal-suffix-hexadecimal-floating-point.cpp
6 ↗(On Diff #167552)

Please remove the TMP duplication and make a extra header for that, that is then included by these tests.

31 ↗(On Diff #167552)

you can safely drop the [readability-..] part of the warning message. It will just make problems if the check should be renamed at some point.

lebedev.ri added inline comments.Sep 29 2018, 9:00 AM
docs/ReleaseNotes.rst
96 ↗(On Diff #167540)

Oh, i was really hoping i simply did not find it :/

lebedev.ri marked 13 inline comments as done.

Thank you for taking a look!

  • Rebased
  • Added an alias in hicpp module
  • Addressed most of the review comments.

Thanks for working on this!

Related as well: http://www.codingstandard.com/rule/4-2-1-ensure-that-the-u-suffix-is-applied-to-a-literal-used-in-a-context-requiring-an-unsigned-integral-expression/
I think its wort a alias is hicpp as well

Hmm, after digging but, i think you meant to link to https://www.codingstandard.com/rule/4-3-1-do-not-convert-an-expression-of-wider-floating-point-type-to-a-narrower-floating-point-type/
With some leeway, i think it does talk about the suffix being uppercase.
Added.

Please add tests that use user-defined literals and ensure there are no collision and that they are not diagnosed. Some examples: https://en.cppreference.com/w/cpp/language/user_literal

All those are UserDefinedLiteral, so we should be good.. https://godbolt.org/z/PcGi0B
Also, it seems the suffix can't be set for these constants: https://godbolt.org/z/YHTqke
So i'm not sure what to test. Can you give an example of a test?

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
33 ↗(On Diff #167552)

Well, because these are *declarations*.
Else the linker complains, since i'm using these in non-constexpr context, but only provided definitions.

42 ↗(On Diff #167552)

Rewrote, hopefully better?

52 ↗(On Diff #167552)

clang says invalid
https://godbolt.org/z/R8bGe_

docs/clang-tidy/checks/readability-uppercase-literal-suffix.rst
9 ↗(On Diff #167552)

Words are the hardest part :)

lebedev.ri added inline comments.Sep 29 2018, 12:09 PM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
53 ↗(On Diff #167552)

Then we will not have an early-return, which is worse than this.

85 ↗(On Diff #167552)

No, because shouldReplaceLiteralSuffix() is a member function which returns that type.
I think it should stay a member function, so theoretically NewSuffix could be a [second] template param, but that is kinda ugly..
I also can't simplify it via using because the NewSuffix is private.

Perhaps we should keep this as-is?

95 ↗(On Diff #167552)

I looked at a randomly-selected few dozen calls to this function within clang-tidy, and none of those do this.
But assert added, better than nothing.

129 ↗(On Diff #167552)

I still want to see the old suffix, but i think we can form the FixitHint here, and store it.

135 ↗(On Diff #167552)

bind() still wants the name though.

152 ↗(On Diff #167552)

Very obscure example (like all macros is), but i think it shows the point:

#include <stdio.h>
#include <stdio.h>

#define xstr(s) str(s)
#define str(s) #s

#define dump(X) printf("%u is " str(X) "nits", X);

int main () {
  dump(1u);

  return 0;
}

will normally print 1 is 1units
But if you uppercase it, it will print 1 is 1Units.

I would honestly prefer to give macros a pass here..

165 ↗(On Diff #167552)

If the warnings are aggregated (i.e. not raw make dump), with only the first line shown, the suffix will still be invisible.

Regarding the pointer direction, i'm not sure.
For now i have reworded the diag to justify pointing at the literal itself.

All those are UserDefinedLiteral, so we should be good.. https://godbolt.org/z/PcGi0B
Also, it seems the suffix can't be set for these constants: https://godbolt.org/z/YHTqke
So i'm not sure what to test. Can you give an example of a test?

I am not suggesting that these should be all uppercase or anything. But the tests should demonstrate, that user-defined literals do not impact analysis and give no false positives.

clang-tidy/hicpp/HICPPTidyModule.cpp
24 ↗(On Diff #167620)

spurious formatting fix, can be committed separate.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
42 ↗(On Diff #167552)

Yes, few nits.

  • 15 in decimal and is (no comma)
  • skip to instead of skip until
52 ↗(On Diff #167552)

fine then.

53 ↗(On Diff #167552)

Ok. Can the dyn_cast be null at all? Shouldn't integerliteral always be a BuiltinType?

85 ↗(On Diff #167552)

Why does it need to be a member? It looks like it only accesses NewSuffix which does nothing that requires private access to the check class.

152 ↗(On Diff #167552)

Transformation is excluded of course.
The user can still silence the warning in case it is misplaced.

165 ↗(On Diff #167552)

I don't understand that. The warning message does include the source location that would be clearly on the literal suffix and the warning without the suffix printed is clear as well. Having this slightly simpler diagnostic would simplify the code significantly.

90 ↗(On Diff #167620)

you can remove the clang:: as we are in that namespace already.

92 ↗(On Diff #167620)

I think you can drop some of the newlines in this function to make the code a bit more dense.

clang-tidy/utils/ASTUtils.cpp
70 ↗(On Diff #167620)

Please add the missing . at the end of the sentence

docs/ReleaseNotes.rst
117 ↗(On Diff #167620)

I believe the doc links for the aliases dangle right now. Please add the doc pages for them, too.

lebedev.ri marked 8 inline comments as done.Sep 29 2018, 2:11 PM
lebedev.ri added inline comments.
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
165 ↗(On Diff #167552)

*location*. which will still be a location, if only the first line of the warning is displayed.

We won't even win all that much.
Sure, we will return Optional<FixitHint>, but we will still need to do all that internal stuff to find the old suffix, uppercase it, compare them, etc.

And we lose a very useful ability to check what it considered to be the old suffix in the tests.

It is basically the same question as in D51949.
If you insist, sure, i can do that, but i *really* do believe this is WRONG.

lebedev.ri marked 16 inline comments as done.

Addressed remaining review notes:

  • Fixed dangling links in docs
  • Don't mishandle user-defined literals
  • Don't ignore macros, do check them. Thanks to @AaronBallman for the macro concatenation example :)
  • Partial fix-it support for macros.

...

Herald added subscribers: Restricted Project, llvm-commits. · View Herald TranscriptSep 30 2018, 12:00 PM
lebedev.ri added inline comments.Sep 30 2018, 12:00 PM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
53 ↗(On Diff #167552)

I don't know?
I would guess no.

85 ↗(On Diff #167552)

Passing LangOpts turned out to be sufficient to move it into anon namespace.

165 ↗(On Diff #167552)

And one more problem here, where do we point if we don't have a fix-it to get the suffix location from?

lebedev.ri changed the repository for this revision from rCRT Compiler Runtime to rCTE Clang Tools Extra.Sep 30 2018, 12:01 PM
lebedev.ri removed subscribers: llvm-commits, Restricted Project.
JonasToth added inline comments.Sep 30 2018, 12:30 PM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
165 ↗(On Diff #167552)

That point is valid, but if we dont have a suffix location, there is no suffix?
I dont insist :)

117 ↗(On Diff #167658)

the comma is not necessary

119 ↗(On Diff #167658)

values are usually not annotated with const, please remove it for consistency

122 ↗(On Diff #167658)

please remove the Now this is troubling.
Please use two or more (what, macros?) instead of two+

192 ↗(On Diff #167658)

ast_matchers:: is not necessary, because there is a using ... at the top of the file

clang-tidy/utils/ASTUtils.cpp
72 ↗(On Diff #167658)

Does it make sense to have a nullptr for SM? I think you can use a reference here

docs/clang-tidy/checks/list.rst
67 ↗(On Diff #167658)

hicpp alias is missing here

lebedev.ri marked 5 inline comments as done.

Address review notes.

clang-tidy/readability/IdentifierNamingCheck.cpp
660–678 ↗(On Diff #167658)

@JonasToth that is what this code did.
From a quick look i'm not sure what it would decide without the source manager,
so i simply refactored it as a function, and kept the semantics.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
165 ↗(On Diff #167552)

See test/clang-tidy/readability-uppercase-literal-suffix-integer-macro.cpp, horrible_macros()
I guess if we point at the suffix, we will also get less useful expanded from messages.

Two main points: I don't think this check is covering all of the suffixes (I don't see q or i32 support, for instance), and at least for the CERT rule this is covering, it diagnoses more than it should. The CERT rule is specific to l vs L but imposes no requirements on u vs U.

clang-tidy/readability/ReadabilityTidyModule.cpp
88–89 ↗(On Diff #167661)

Please keep this sorted alphabetically.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
25 ↗(On Diff #167661)

It's unfortunate that clang::IntegerLiteral and clang::tidy::readability::IntegerLiteral are distinct types that are strongly related to one another. I'm not keen on this name, as it means reading the code, an unqualified IntegerLiteral is hard to reason about.

30–31 ↗(On Diff #167661)

There are other suffixes for integer literals. See NumericLiteralParser.

37 ↗(On Diff #167661)

Same naming concerns here.

47–48 ↗(On Diff #167661)

There are more suffixes here (see NumericLiteralParser again), and this comment seems to be copy pasta.

99–100 ↗(On Diff #167661)

Don't use auto here.

122 ↗(On Diff #167661)

Don't use auto here.

lebedev.ri marked 8 inline comments as done.

Hopefully address review notes:

  • Support more (all?) obscure suffixes (complex, q, h, f16, i{8,16,32,64})
  • Improve test coverage. There isn't coverage for every single combination/permuation of everything, but hopefully at least the basic coverage.
  • Support a list of destination suffixes, thus the replacements can be finely controlled.
JonasToth added inline comments.Oct 13 2018, 2:52 AM
clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
94 ↗(On Diff #169554)

Is this std::find_if?

lebedev.ri marked an inline comment as done.

Bloat with llvm::find_if()

aaron.ballman accepted this revision.Oct 17 2018, 3:03 PM

LGTM aside from minor wording nits.

clang-tidy/readability/UppercaseLiteralSuffixCheck.cpp
31–32 ↗(On Diff #169556)

The comment is somewhat out of sync with the code because of the i and j suffixes as well.

45 ↗(On Diff #169556)

I think you mean f is both a valid hexadecimal digit in a hex float literal and a valid floating-point literal suffix.

101 ↗(On Diff #169556)

i -> I

212 ↗(On Diff #169556)

have already-uppercase suffix -> have a suffix that is already uppercase

216 ↗(On Diff #169556)

upper-case -> uppercase

docs/clang-tidy/checks/readability-uppercase-literal-suffix.rst
28 ↗(On Diff #169556)

I'd reword this slightly:

"and if a replacement is found that is different from the current suffix, then the diagnostic is issued. This allows for fine-grained control of what suffixes to consider and what their replacements should be."

This revision is now accepted and ready to land.Oct 17 2018, 3:03 PM
lebedev.ri marked 6 inline comments as done.
  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).
  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull. I've pinged the authors and can hopefully get an answer quickly, but if not, I'm fine with fixing that after the patch goes in.

clang-tidy/cert/CERTTidyModule.cpp
87 ↗(On Diff #170045)

Don't use auto here.

docs/clang-tidy/checks/readability-uppercase-literal-suffix.rst
6 ↗(On Diff #170045)

You should consider calling out the CERT behavioral differences.

  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull.

I agree, i've thought so too.

That will open an interesting question: in lu, l should be upper-case. What about u? We can't keep it as-is.
We will either consistently upper-case it, or consistently lower-case it.
I.e. given [lL][uU], should we *always* produce Lu, or LU?

I've pinged the authors

Thank you.

and can hopefully get an answer quickly, but if not, I'm fine with fixing that after the patch goes in.

Thank you.

  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull.

I agree, i've thought so too.

That will open an interesting question: in lu, l should be upper-case. What about u? We can't keep it as-is.
We will either consistently upper-case it, or consistently lower-case it.
I.e. given [lL][uU], should we *always* produce Lu, or LU?

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

  • Apply minor wording nits.
  • For cert-dcl16-c, only consider L, LL suffixes, not anything else (not even llu).

I'll find out about the DCL16-C recommendation, as I suspect the intent is to cover lu and llu but not ul and ull.

I agree, i've thought so too.

That will open an interesting question: in lu, l should be upper-case. What about u? We can't keep it as-is.
We will either consistently upper-case it, or consistently lower-case it.
I.e. given [lL][uU], should we *always* produce Lu, or LU?

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

Thank you!

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

Ok, nothing unexpected.
So the full revised list is: "L;LL:LU;LLU".

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

Thank you!

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

Ok, nothing unexpected.
So the full revised list is: "L;LL:LU;LLU".

Agreed. It might be reasonable to add the above as some extra test cases for the CERT check if they're not already covered elsewhere.

I talked to someone at CERT responsible for maintaining DCL16-C to get their opinion on tightening the wording of the rule and their stated intent is:

Thank you!

"If the first character is 'ell', it should be capitalized. The other ells need not be, and the yew's need not be capitalized either."

e.g.,
11lu -> diagnose
11ul -> fine
11llu -> diagnose
11lLu -> diagnose
11Llu -> fine
11ul -> fine

That said, the author (and I) agree that it'd be perfectly okay to diagnose things like 11Llu and recommend 11LLU as a replacement.

Ok, nothing unexpected.
So the full revised list is: "L;LL:LU;LLU".

Agreed. It might be reasonable to add the above as some extra test cases for the CERT check if they're not already covered elsewhere.

Those exist as the test cases for CERT (although, i only test the integer case for CERT, not float.), i will only need to adjust the expected output.

lebedev.ri marked 2 inline comments as done.

CERT: also handle "lu", "llu".

aaron.ballman accepted this revision.Oct 18 2018, 12:06 PM

LGTM! Thanks for the changes!

This revision was automatically updated to reflect the committed changes.
lebedev.ri reopened this revision.Oct 25 2018, 12:47 PM
This revision is now accepted and ready to land.Oct 25 2018, 12:47 PM

Reverted in rL345305 due to multiple apparently-lurking bugs.

Finally finished creducing the issue i wasn't able to understand even after D53719+D53723,
and it turned out to be hasParent()vs.hasAncestor(). (all tests added)

With that fixed, i do not observe any other bugs.

This revision is now accepted and ready to land.Oct 26 2018, 4:06 AM

Rerun on clang-tools-extra+lld+clang codebase, nothing broken.
I'm gonna reland.

This revision was automatically updated to reflect the committed changes.