This is an archive of the discontinued LLVM Phabricator instance.

Differential D124500

[clang-tidy] Support expressions of literals in modernize-macro-to-enum
ClosedPublic

Authored by LegalizeAdulthood on Apr 26 2022, 8:29 PM.

Details

Reviewers

aaron.ballman
njames93

Commits

rG512273833136: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

Summary

Add a recursive descent parser to match macro expansion tokens against
fully formed valid expressions of integral literals. Partial expressions will
not be matched -- they can't be valid initializing expressions for an enum.

Fixes #55055
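
As a rough illustration of the approach described in the summary, here is a minimal sketch of a recursive descent matcher that accepts a token sequence only when it forms one complete expression. It is not the class added by the patch: the name `LiteralExprMatcher`, the use of plain strings as tokens, and the reduced grammar (decimal literals, unary operators, `* / %`, `+ -`, parentheses) are all assumptions made for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch, not the matcher from the patch. Tokens are plain
// strings and only a small subset of the expression grammar is covered.
class LiteralExprMatcher {
public:
  explicit LiteralExprMatcher(std::vector<std::string> Tokens)
      : Toks(std::move(Tokens)) {}

  // True only if the whole token sequence is one complete expression;
  // partial expressions and trailing tokens are rejected.
  bool match() {
    Pos = 0;
    return expr() && Pos == Toks.size();
  }

private:
  bool atEnd() const { return Pos == Toks.size(); }

  // Consume the next token if it is exactly T.
  bool accept(const std::string &T) {
    if (!atEnd() && Toks[Pos] == T) {
      ++Pos;
      return true;
    }
    return false;
  }

  // primary := decimal-literal | '(' expr ')'
  bool primary() {
    if (accept("("))
      return expr() && accept(")");
    if (atEnd() || Toks[Pos].empty())
      return false;
    for (unsigned char C : Toks[Pos])
      if (!std::isdigit(C))
        return false; // Identifiers, ';', and so on bail out here.
    ++Pos;
    return true;
  }

  // unary := ('+' | '-' | '~' | '!')* primary
  bool unary() {
    while (accept("+") || accept("-") || accept("~") || accept("!")) {
    }
    return primary();
  }

  // multiplicative := unary (('*' | '/' | '%') unary)*
  bool multiplicative() {
    if (!unary())
      return false;
    while (accept("*") || accept("/") || accept("%"))
      if (!unary())
        return false;
    return true;
  }

  // expr := multiplicative (('+' | '-') multiplicative)*
  bool expr() {
    if (!multiplicative())
      return false;
    while (accept("+") || accept("-"))
      if (!multiplicative())
        return false;
    return true;
  }

  std::vector<std::string> Toks;
  std::size_t Pos = 0;
};
```

Under these assumptions, the tokens of `12 + +1` match, while `1 +` (a partial expression) and `foo + 1` (an identifier) do not; a rejected macro is simply left alone.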

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

LegalizeAdulthood created this revision. Apr 26 2022, 8:29 PM

Herald added a project: Restricted Project. Apr 26 2022, 8:29 PM

Herald added subscribers: carlosgalvezp, xazax.hun, mgorny.

LegalizeAdulthood requested review of this revision. Apr 26 2022, 8:29 PM

Herald added a project: Restricted Project. Apr 26 2022, 8:29 PM

Herald added a subscriber: cfe-commits.

LegalizeAdulthood set the repository for this revision to rG LLVM Github Monorepo. Apr 26 2022, 8:29 PM

LegalizeAdulthood added inline comments. Apr 26 2022, 8:32 PM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
2	ditto
293	Structure of these feels very similar, I'll see if I can squish out the duplication
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
2	Uh.... I guess this needs some sort of copyright notice?
clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
325	inline variable made explicit for debugging

Harbormaster completed remote builds in B161525: Diff 425407. Apr 26 2022, 8:58 PM

Add banner block comment for new files
Extract Functions to eliminate duplication

Inline Variable

Harbormaster completed remote builds in B161538: Diff 425425. Apr 27 2022, 12:07 AM

LegalizeAdulthood added inline comments. Apr 27 2022, 8:35 AM

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
2	Needs a file header

Add file block comment

Harbormaster completed remote builds in B161727: Diff 425679. Apr 27 2022, 7:26 PM

Update documentation

Harbormaster completed remote builds in B161742: Diff 425700. Apr 27 2022, 11:33 PM

aaron.ballman added inline comments. Apr 29 2022, 7:43 AM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
26	(Same suggestion elsewhere -- just double check that all the comments are full sentences with capitalization and punctuation.)
33–34	Do we need to care about integer suffixes that make a non-integer type, like: https://godbolt.org/z/vx3xbGa41
100	I know this is code moved from elsewhere, but I suppose we never considered the odd edge case where a user does something like `"foo"[0]` as a really awful integer constant. :-D
186	There is a GNU extension in this space: https://godbolt.org/z/PrWY3T6hY
205	Comma operator?
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
10	We don't use #pragma once (not portable, not reliable).
19–21	Oh boy, I'm not super excited about having another parser to maintain... It'd be nice if we had a ParserUtils.cpp/h file that made it easier to go from an arbitrary array of tokens to AST nodes + success/failure information on parsing the tokens. It's not strictly needed for what you're trying to accomplish here, but it would be a much more general interface and it would remove the support burden from adding another parser that's out of Clang's tree.
clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
26	Other interesting tests I'd expect we could convert into an enum (at least theoretically):
#define A 12 + +1
#define B 12 - -1
#define C (1, 2, 3)
#define D 100 ? : 8
#define E 100 ? 100 : 8
#define F 'f'
#define G "foo"[0]
#define H 1 && 2
#define I 1 || 2
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
67	`12i`, `.0`
100	`100 ? : 10`, `1, 2`
135	This one is valid

LegalizeAdulthood marked 2 inline comments as done. Apr 29 2022, 10:45 AM

LegalizeAdulthood added inline comments.

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
33–34	I don't think those will be parsed as literal tokens by the preprocessor, but I'll check.
100	It's always possible to write crazy contorted code and have a check not recognize it. I don't think it's worthwhile to spend time trying to handle torture cases unless we can find data that shows it's prevalent in real world code. If I was doing a code review and saw this:
enum { FOO = "foo"[0] };
I'd flag that in a code review as bogus, whereas if I saw:
enum { FOO = 'f' };
that would be acceptable, which is why character literals are accepted as an integral literal initializer for an enum in this check.
186	Do you have a link for the extension?
205	Remember that the use case here is identifying expressions that are initializers for an enum. If you were doing a code review and you saw this:
enum { FOO = (2, 3) };
Would you be OK with that? I wouldn't. Clang even warns about it: https://godbolt.org/z/Y641cb8Y9
Therefore I deliberately left comma operator out of the grammar.
clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
26	Most of these (except comma operator and string subscript, see my comments earlier) are covered in the unit test for the matcher. I'll add tests for these: `12 + +1`, `12 - -1`, `100 ? : 8`
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
67	`.0` is already covered by the case `1.23`. I'm not home brewing tokenization, but using the Lexer to do that. `12i` I need to investigate to find out what the Lexer does.

aaron.ballman added inline comments. Apr 29 2022, 11:50 AM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	"I don't think it's worthwhile to spend time trying to handle torture cases unless we can find data that shows it's prevalent in real world code."

I think I'm okay agreeing to that in this particular case, but this is more to point out that writing your own parser is a maintenance burden. Users will hit cases we've both forgotten about here, they'll file a bug, then someone has to deal with it. It's very hard to justify to users "we think you write silly code" because they often have creative ways in which their code is not actually so silly, especially when we support "most" valid expressions.
186	https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html
205	This is another case where I think you're predicting that users won't be using the full expressivity of the language and we'll get bug reports later. Again, in isolation, I tend to agree that I wouldn't be happy seeing that code. However, users write some very creative code and there's no technical reason why we can't or shouldn't handle comma operators.

To clarify my previous comments, I'm fine punting on the edge cases until user reports come in, so don't let them block this review if you feel strongly about not supporting them. But when user reports start coming in, at some point I might start asking to replace the custom parser with calling into the clangParse library through some invented utility interface so that we don't have to deal with a long tail of bug reports.

LegalizeAdulthood marked 3 inline comments as done. Apr 29 2022, 12:40 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	Writing your own parser is unavoidable here because we can't just assume that any old thing will be a valid initializer just by looking at the set of tokens present in the macro body. (There is a separate discussion going on about improving the preprocessor support and parsing things more deeply, but that isn't even to the point of a prototype yet.) The worst thing we can do is create "fixits" that produce invalid code. The worst that happens if your expression isn't recognized is that your macro isn't matched as a candidate for an enum. You can always make it an enum manually and join it with adjacent macros that were recognized and converted.

As it stands, the check only recognizes a single literal with an optional unary operator. This change expands the check to recognize a broad range of expressions, allowing those macros to be converted to enums. I opened the issue because running modernize-macro-to-enum on the ITK codebase showed some simple expressions involving literals that weren't recognized and converted.

If an expression isn't recognized and an issue is opened, it will be an enhancement request to support a broader range of expressions, not a bug that this check created invalid code.

IMO, the more useful thing that's missing from the grammar is recognizing `sizeof` expressions rather than indexing string literals with an integral literal subscript. I had planned on doing another increment to recognize `sizeof` expressions.
205	"Don't let the perfect be the enemy of the good." My inclination is to simply explicitly state that comma operator is not recognized in the documentation. It's already implicit by its absence from the list of recognized operators. Again, the worst that happens is that your macro isn't converted. I'm open to being convinced that it's important, but you haven't convinced me yet `:)`

In D124500#3483224, @aaron.ballman wrote:

To clarify my previous comments, I'm fine punting on the edge cases until user reports come in, so don't let them block this review if you feel strongly about not supporting them. But when user reports start coming in, at some point I might start asking to replace the custom parser with calling into the clangParse library through some invented utility interface so that we don't have to deal with a long tail of bug reports.

Yeah, I think punting on edge cases is the right thing to do here. As I say,
the worst that happens is your macro isn't converted automatically when you
could convert it manually.

Maybe we're thinking about this check differently.

I want this check to do the majority of the heavy lifting so that I'm only left with a
few "weird" macros that I might have to convert by hand. I never, ever, ever want
this check to generate invalid code. Therefore it is of paramount importance that
it be conservative about what it recognizes as a candidate for conversion.

I think this is the baseline for all the modernize checks, really. There are still cases
where loop-convert doesn't recognize an iterator based loop that could be
converted to a range-based for loop. The most important thing is that loop-convert
doesn't take my weird iterator based loop and convert it to a range based for loop
that doesn't compile.

As for calling into clangParse, I think that would be overkill for a couple reasons.
First, the real parser is going to do a lot of work creating AST nodes which we will
never use, except to traverse the structure looking for things that would invalidate
it as a candidate for a constant initializing expression. Second, we only need to
match the structure, we don't need to extract any information from the token stream
other than a "thumbs up" or "thumbs down" that it is a valid initializing expression.
Many times in clang-tidy reviews performance concerns are raised and I think
matching the token stream with the recursive descent matcher here is going to be
much, much faster than invoking clangParse, particularly since the matcher bails
out early on the first token that doesn't match. The only thing I can think of that
would make it faster is if we could get the lexed tokens from the preprocessor
instead of making it re-lex the macro body, but that's a change beyond the scope
of this check or even clang-tidy.

LegalizeAdulthood marked 3 inline comments as done. Apr 29 2022, 1:02 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
67	OK, so `12i` turns into `numeric_constant` token, so I've added test cases to exclude those and enhanced the matcher. Essentially that's a bug in the existing implementation that `12i` wasn't rejected outright.
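
Because the lexer hands back `12i` and `1.23` as the same `numeric_constant` token kind as `12`, the matcher has to look at the spelling. The following is a loose sketch of such a check, assuming a hypothetical helper `looksLikeIntegerLiteral`; it is not the code from the patch, and it does not validate octal digits, binary prefixes, digit separators, or the C++23 size suffixes.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not the code from the patch: decide from the
// spelling of a numeric token whether it is an integer literal.
bool looksLikeIntegerLiteral(const std::string &Spelling) {
  std::size_t I = 0;
  const bool Hex = Spelling.size() > 2 && Spelling[0] == '0' &&
                   (Spelling[1] == 'x' || Spelling[1] == 'X');
  if (Hex)
    I = 2;

  // Consume the digit sequence.
  const std::size_t DigitsBegin = I;
  while (I < Spelling.size()) {
    const unsigned char C = static_cast<unsigned char>(Spelling[I]);
    if (!(Hex ? std::isxdigit(C) : std::isdigit(C)))
      break;
    ++I;
  }
  if (I == DigitsBegin)
    return false; // No digits at all, e.g. ".0".

  // Whatever follows the digits must be an integer suffix. Lowercasing is
  // a simplification: it also lets through mixed-case forms such as "lL".
  std::string Suffix;
  for (; I < Spelling.size(); ++I)
    Suffix.push_back(static_cast<char>(
        std::tolower(static_cast<unsigned char>(Spelling[I]))));
  return Suffix.empty() || Suffix == "u" || Suffix == "l" || Suffix == "ul" ||
         Suffix == "lu" || Suffix == "ll" || Suffix == "ull" ||
         Suffix == "llu";
}
```

With this sketch, `12`, `0x1e`, and `42ULL` are accepted, while `12i`, `1.23`, `.0`, and `1e5` are rejected because the text after the digits is not an integer suffix.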

LegalizeAdulthood marked 7 inline comments as done. Apr 29 2022, 1:19 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
19–21	Yeah, I'm not a fan of duplication either, but see my earlier comments about why I think clangParse is overkill here.

Update from review comments

Harbormaster completed remote builds in B162055: Diff 426160. Apr 29 2022, 2:43 PM

LegalizeAdulthood added a reviewer: njames93. Apr 30 2022, 11:22 AM

LegalizeAdulthood added inline comments. Apr 30 2022, 9:03 PM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
205	It wasn't much extra work/code to add comma operator support so I've done that.

Recognize comma operator expressions

FYI, one nice addition of the parsing of macro bodies is that it paves the way for
a modernize-macro-to-function check that converts function-like macros that
compute values to template functions. Once this change has landed, I'll be putting
up a review for that.

Harbormaster completed remote builds in B162133: Diff 426263. Apr 30 2022, 9:27 PM

In D124500#3483328, @LegalizeAdulthood wrote:

In D124500#3483224, @aaron.ballman wrote:

To clarify my previous comments, I'm fine punting on the edge cases until user reports come in, so don't let them block this review if you feel strongly about not supporting them. But when user reports start coming in, at some point I might start asking to replace the custom parser with calling into the clangParse library through some invented utility interface so that we don't have to deal with a long tail of bug reports.

Yeah, I think punting on edge cases is the right thing to do here. As I say,
the worst that happens is your macro isn't converted automatically when you
could convert it manually.

I largely agree, but I've found cases where we'll convert correct code to incorrect code, so it's a bit stronger than that.

Maybe we're thinking about this check differently.

I want this check to do the majority of the heavy lifting so that I'm only left with a
few "weird" macros that I might have to convert by hand. I never, ever, ever want
this check to generate invalid code. Therefore it is of paramount importance that
it be conservative about what it recognizes as a candidate for conversion.

I think that's a reasonable goal, but we're not meeting the "never ever generate invalid code" part. I already know we can break correct C and C++ code through overflow. Should we ever allow an option to use an enum with a fixed underlying type in C++ (either enum class or enum : type form), we'll have the same breakage there but at different thresholds.

I think this is the baseline for all the modernize checks, really. There are still cases
where loop-convert doesn't recognize an iterator based loop that could be
converted to a range-based for loop. The most important thing is that loop-convert
doesn't take my weird iterator based loop and convert it to a range based for loop
that doesn't compile.

As for calling into clangParse, I think that would be overkill for a couple reasons.
First, the real parser is going to do a lot of work creating AST nodes which we will
never use, except to traverse the structure looking for things that would invalidate
it as a candidate for a constant initializing expression. Second, we only need to
match the structure, we don't need to extract any information from the token stream
other than a "thumbs up" or "thumbs down" that it is a valid initializing expression.

There's a few reasons I disagree with this. First, you need to know the value of the constant expression in order to know whether it's valid as an enumeration constant. That alone requires expression evaluation capabilities, assuming you want the check to behave correctly for those kinds of cases. But second, without support for generating that AST and validating it, you can never handle cases like this:

constexpr int a = 12;
constexpr int foo() { return 12; }

#define FOO (a + 1)
#define BAR (a + 2)
#define BAZ (a + 3)
#define QUUX (foo() + 4)

because you have no way to know whether or not that constant expression is valid based solely on token soup (especially when you start to factor in namespaces, qualified lookup, template instantiations, etc). So I think avoiding the parser will limit the utility of this check. And maybe that's fine, maybe it's only ever intended to convert C code using defines to C code using enumerations or other such simple cases.
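
To make the point above concrete, here is a small sketch showing that macro bodies referring to `constexpr` entities are perfectly valid enumerator initializers in C++, even though a token-only matcher that accepts just literals would never consider them. The enumerator names `FooValue` and `QuuxValue` are made up for illustration.

```cpp
constexpr int a = 12;
constexpr int foo() { return 12; }

#define FOO (a + 1)
#define QUUX (foo() + 4)

// Both macro bodies are constant expressions, so they are valid
// enumerator initializers. Deciding that requires name lookup and
// constant evaluation, which token matching alone does not provide.
enum { FooValue = FOO, QuuxValue = QUUX };

static_assert(FooValue == 13, "a + 1 is usable as an enumerator value");
static_assert(QuuxValue == 16, "foo() + 4 is usable as an enumerator value");
```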

Many times in clang-tidy reviews performance concerns are raised and I think
matching the token stream with the recursive descent matcher here is going to be
much, much faster than invoking clangParse, particularly since the matcher bails
out early on the first token that doesn't match.

Absolutely 100% agreed on this point.

The only thing I can think of that
would make it faster is if we could get the lexed tokens from the preprocessor
instead of making it re-lex the macro body, but that's a change beyond the scope
of this check or even clang-tidy.

Yeah, and that wouldn't even be sufficient because I still think we're going to want to know the *value* of the expression at some point.

Allllll that "this is the wrong way!" doom and gloom aside, I still think you're fine to proceed with the current approach if you'd like to. The situations under which it will break correct code should be something we document explicitly in the check somewhere, but they feel sufficiently like edge cases to me that I'm fine moving forward with the caution in mind that if this becomes too much more problematic in the future, we have *a* path forward we could use to improve things should we decide we need to.

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	"Writing your own parser is unavoidable here because we can't just assume that any old thing will be a valid initializer just by looking at the set of tokens present in the macro body."

If you ran the token sequence through clang's parser and got an AST node out, you'd have significantly more information as to whether something is a valid enum constant initializer because you can check that it's an actual constant expression and that it's within a valid range of values. This not only fixes edge case bugs with your approach (like the fact that you can generate a series of literal expressions that result in a value too large to store within an enumerator constant), but it enables new functionality your approach currently disallows (like using constexpr variables instead of just numeric literals). So I don't agree that it's unavoidable to write another custom parser.
205	"Don't let the perfect be the enemy of the good."

This is a production compiler toolchain. Correctness is important and that sometimes means caring more about perfection than you otherwise would like to.

"I'm open to being convinced that it's important, but you haven't convinced me yet :)"

It's less about importance and more about maintainability coupled with correctness. With your approach, we get something that will have a long tail of bugs. If you used Clang's parser, you don't get the same issue -- maintenance largely comes along for free, and the bugs are far less likely. About the only reason I like your current approach over using clang's parsing is that it quite likely performs much better than doing an actual token parsing of the expression. But as you pointed out, about the worst thing a check can do is take correct code and make it incorrect -- doing that right requires some amount of semantic evaluation of the expressions (which you're not doing). For example:

#define FINE 1LL << 30LL;
#define BAD 1LL << 31LL;
#define ALSO_BAD 1LL << 32L;

We'll convert this into an enumeration and break `-pedantic-errors` builds in C. If we had a `ConstantExpr` object, we could see what its value is and note that it's greater than what fits into an `int` and decide to do something smarter. So I continue to see the current approach as being somewhat reasonable (especially for experimentation), but incorrect in the long run. Not sufficiently incorrect for me to block this patch on, but incorrect enough that the first time this check becomes a maintenance burden, I'll be asking more strongly to do this the correct way.

In D124500#3485462, @aaron.ballman wrote:

In D124500#3483328, @LegalizeAdulthood wrote:

In D124500#3483224, @aaron.ballman wrote:

To clarify my previous comments, I'm fine punting on the edge cases until user reports come in, so don't let them block this review if you feel strongly about not supporting them. But when user reports start coming in, at some point I might start asking to replace the custom parser with calling into the clangParse library through some invented utility interface so that we don't have to deal with a long tail of bug reports.

Yeah, I think punting on edge cases is the right thing to do here. As I say,
the worst that happens is your macro isn't converted automatically when you
could convert it manually.

I largely agree, but I've found cases where we'll convert correct code to incorrect code, so it's a bit stronger than that.

Are you talking generally, or with this check? I can't see how this check
is going to generate incorrect code (so far).

I think that's a reasonable goal, but we're not meeting the "never ever generate invalid code" part.

How so? Can you give an example where this check will produce invalid code?

As for calling into clangParse, I think that would be overkill for a couple reasons.
First, the real parser is going to do a lot of work creating AST nodes which we will
never use, except to traverse the structure looking for things that would invalidate
it as a candidate for a constant initializing expression. Second, we only need to
match the structure, we don't need to extract any information from the token stream
other than a "thumbs up" or "thumbs down" that it is a valid initializing expression.

There's a few reasons I disagree with this. First, you need to know the value of the
constant expression in order to know whether it's valid as an enumeration constant.

I'm not following you. Nothing requires knowing this yet.

constexpr int a = 12;
constexpr int foo() { return 12; }

#define FOO (a + 1)
#define BAR (a + 2)
#define BAZ (a + 3)
#define QUUX (foo() + 4)

QUUX will never be converted to an enum by this check. It references an identifier foo.

The situations under which it will break correct code should be something we document explicitly

You haven't shown an example yet where it will break code.

LegalizeAdulthood added inline comments. May 2 2022, 9:44 AM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	You keep bringing up the idea that the values have to be known, but so far they don't. Remember, we are replacing macro identifiers with anonymous enum identifiers. We aren't specifying a restricting type to the enum, so as long as it's a valid integral literal expression, we're not changing any semantics. Unscoped enums also allow arbitrary conversions to/from an underlying integral type chosen by the compiler.

C++20 9.7.1 paragraph 7 says: "For an enumeration whose underlying type is not fixed, the underlying type is an integral type that can represent all the enumerator values defined in the enumeration. If no integral type can represent all the enumerator values, the enumeration is ill-formed. It is implementation-defined which integral type is used as the underlying type except that the underlying type shall not be larger than int unless the value of an enumerator cannot fit in an int or unsigned int. If the enumerator-list is empty, the underlying type is as if the enumeration had a single enumerator with value 0."

So the compiler is free to pick an underlying type that's large enough to handle all the explicitly listed initial values. Do we actually need to know the values for this check? I don't think so, because we aren't changing anything about the type of the named values. When the compiler evaluates an integral literal, it goes through a similar algorithm assigning the appropriate type to those integral values:

C++20 5.9 paragraph 2: "A preprocessing number does not have a type or a value; it acquires both after a successful conversion to an integer-literal token or a floating-point-literal token."

C++20 5.13.2 paragraph 3: "The type of an integer-literal is the first type in the list in Table 8 corresponding to its optional integer-suffix in which its value can be represented."

The table says the type is int, unsigned int, long int, unsigned long int, long long int, or unsigned long long int based on the suffix and the value and that the type is chosen to be big enough to hold the value if the suffix is unspecified.

"but [using `clangParse`] enables new functionality your approach currently disallows (like using constexpr variables instead of just numeric literals)."

I agree that if we used the full parser, we'd bring in `constexpr` expressions as valid initializers for the enums. However, before engaging in all that work, I'd like to see how likely this is in existing codebases by feedback from users requesting the support. Maybe engaging the parser isn't a big amount of work, I don't actually know. I've never looked deeply at the actual parsing code in clang. Maybe it's easy enough to throw a bag of tokens at it and get back an AST node, maybe not. (I suspect not based on my experience with the code base so far.) My suspicion is that code bases that are heavy with macros for constants aren't using modern C++ in the body of those macros to define the values of those constants. Certainly this is 100% true for C code that uses macros to define constants, by definition. This check applies equally well to C code as C has had enums forever but even recent C code still tends to use macros for constants. Still, my suspicions aren't data. I'd like to get this check deployed in a basic fashion and let user feedback provide data on what is important.

"So I don't agree that it's unavoidable to write another custom parser."

That's a fair point. Some kind of parser is needed to recognize valid initializer expressions or we run the risk of transforming valid code into invalid code. Whether it is a custom recognizer as I've done or `clangParse` is what we're debating here.
205	"'Don't let the perfect be the enemy of the good.' This is a production compiler toolchain. Correctness is important and that sometimes means caring more about perfection than you otherwise would like to."

That's fair.

"For example:
#define FINE 1LL << 30LL;
#define BAD 1LL << 31LL;
#define ALSO_BAD 1LL << 32L;"

Oh this brings up the pesky "semicolons disappear from the AST" issue. I wonder what happens when we're just processing tokens, though. I will add a test to see. This could be a case where my approach results in more correctness than `clangParse`!

"Not sufficiently incorrect for me to block this patch on, but incorrect enough that the first time this check becomes a maintenance burden, I'll be asking more strongly to do this the correct way."

I agree.
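
The standard wording quoted in this comment can be checked directly in C++. The sketch below is only an illustration and assumes a platform with a 32-bit `int`; the enumerator name `Wide` is made up.

```cpp
#include <type_traits>

// The suffix selects the starting point in the list of candidate types.
static_assert(std::is_same<decltype(1), int>::value, "no suffix, fits in int");
static_assert(std::is_same<decltype(1U), unsigned int>::value, "u suffix");
static_assert(std::is_same<decltype(1LL), long long>::value, "ll suffix");

// Assumption: int is 32 bits. An unsuffixed literal that does not fit in
// int takes the next type in the list that can represent it.
static_assert(!std::is_same<decltype(2147483648), int>::value,
              "too large for a 32-bit int");

// In C++, an unscoped enum without a fixed underlying type gets an
// underlying type wide enough for all of its enumerators.
enum { Wide = 1LL << 32 };
static_assert(sizeof(std::underlying_type<decltype(Wide)>::type) > sizeof(int),
              "underlying type is wider than a 32-bit int");
```

This only demonstrates the C++ rules; it says nothing about C, where the constraints on enumeration constants differ.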

LegalizeAdulthood marked 2 inline comments as done. May 2 2022, 9:47 AM

In D124500#3485924, @LegalizeAdulthood wrote:

In D124500#3485462, @aaron.ballman wrote:

I largely agree, but I've found cases where we'll convert correct code to incorrect code, so it's a bit stronger than that.

Are you talking generally, or with this check? I can't see how this check
is going to generate incorrect code (so far).

Specifically this check.

I think that's a reasonable goal, but we're not meeting the "never ever generate invalid code" part.

How so? Can you give an example where this check will produce invalid code?

As posted before:

#define FINE 1LL << 30LL
#define BAD 1LL << 31LL
#define ALSO_BAD 1LL << 32L

Now with godbolt goodness: https://godbolt.org/z/Tzbe8qWT5

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	"You keep bringing up the idea that the values have to be known, but so far they don't."

See comments at the top level.

"So the compiler is free to pick an underlying type that's large enough to handle all the explicitly listed initial values. Do we actually need to know the values for this check?"

Yes, C requires the enumeration constants to be representable with `int`. But also, because this is in the `modernize` module, it's very likely we'll be getting a request to convert to using a scoped enumeration or an enumeration with the appropriate fixed underlying type in C++ as well.
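
The range problem with the three macros can be sketched as a compile-time check. The helper `fitsInInt` is hypothetical and only models the constraint described above; the sketch assumes a 32-bit `int`.

```cpp
#include <limits>

// Hypothetical helper: true if Value is representable as int, which is
// the constraint the reviewer describes for C enumeration constants.
constexpr bool fitsInInt(long long Value) {
  return Value >= std::numeric_limits<int>::min() &&
         Value <= std::numeric_limits<int>::max();
}

#define FINE 1LL << 30LL
#define BAD 1LL << 31LL
#define ALSO_BAD 1LL << 32L

// Assumption: int is 32 bits, so INT_MAX is 2147483647.
static_assert(fitsInInt(FINE), "1073741824 fits in int");
static_assert(!fitsInInt(BAD), "2147483648 is one past INT_MAX");
static_assert(!fitsInInt(ALSO_BAD), "4294967296 does not fit in int");
```

Every literal token in these macros is small; only the computed value is out of range, which is why the value of the expression matters here.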

LegalizeAdulthood added inline comments. May 2 2022, 1:11 PM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	Oh, I see now, thanks for explaining it. I didn't realize that C restricts the values to `int`.

aaron.ballman added inline comments. May 2 2022, 1:17 PM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	You're welcome, sorry for not pointing it out sooner!

LegalizeAdulthood added inline comments. May 2 2022, 1:19 PM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	Regarding conversion to a scoped enum, I think that is best handled by a separate enum-to-scoped-enum check. I have one I've been working on separately. As bad as it is to convert macros (since they have no respect for structure or scope), it's quite a bit of work to convert a non-scoped enum to a scoped enum because now implicit conversions enter the picture and expressions involving macros (e.g. `FLAG_X | FLAG_Y`) also get much more complicated. Not only that but usages have to have types updated. I don't think it's very useful to upgrade to a scoped enum and then have every use wrapped in `static_cast<int>()`. It just creates uglier code than what was there before and I don't think people would adopt such a check.

Regarding conversion to an appropriate fixed underlying type, that isn't allowed on unscoped enums, only on scoped enums, so it has all the above complexity plus selecting the appropriate fixed underlying type.

aaron.ballman added inline comments. May 3 2022, 4:49 AM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	"Regarding conversion to a scoped enum, I think that is best handled by a separate enum-to-scoped-enum check."

It's been a while since I checked, but I recall that checks with interacting fix-its tend not to play well together. We should probably see if that's still the case today. As an example, if the enum-to-scoped-enum check runs BEFORE the modernize-macro-to-enum check, then the behavior will be worse than if the checks are run in the reverse order. Because of issues like that, I'm not quite as convinced that a separate check is best (though I do agree it's notionally better).

"Regarding conversion to an appropriate fixed underlying type, that isn't allowed on unscoped enums, only on scoped enums, so it has all the above complexity plus selecting the appropriate fixed underlying type."

That's incorrect; fixed underlying types and scoped enumerations are orthogonal features (though a scoped enumeration always has a fixed underlying type): https://godbolt.org/z/sGYsjdnrT
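
A short sketch of the orthogonality described above, using made-up enum names: an unscoped enum can carry a fixed underlying type, and a scoped enum has one even when none is written.

```cpp
#include <type_traits>

// Unscoped enum with a fixed underlying type (valid since C++11).
enum Small : unsigned char { SmallA = 1, SmallB = 2 };

// Scoped enum without an explicit type: the underlying type is int.
enum class Scoped { X, Y };

static_assert(
    std::is_same<std::underlying_type<Small>::type, unsigned char>::value,
    "unscoped enum keeps the fixed underlying type");
static_assert(std::is_same<std::underlying_type<Scoped>::type, int>::value,
              "scoped enum defaults to int");

// The unscoped enum still converts implicitly, so flag-style expressions
// keep working without casts.
constexpr int Both = SmallA | SmallB;
static_assert(Both == 3, "implicit conversion to an integer type");
```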

LegalizeAdulthood marked an inline comment as done. May 8 2022, 7:24 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	You're right that there can be unexpected interactions between checks when you run multiple of them concurrently, but this has always been the case and isn't surprising to me. This doesn't seem to be a situation unique to these checks though. As more and more transformations become available through clang-tidy, it's inevitable that two different checks will want to modify the same piece of code. For instance, the identifier naming check and modernize-loop-convert. Modernize-loop-convert can eliminate variables entirely (iterators go poof), while the identifier check wants to rename the iterators.

Huh. OK, good to know. I tried doing an underlying type on an unscoped enum and I got a compilation error; I must have just done it wrong.

OK, so thinking about this review a little more, I propose this:

- Take the check as is, but document that the initializing expressions may result in an invalid enum, particularly for C which restricts the underlying type to be int
- Create a subsequent commit that rejects the enums where the language is C and the initializing expression is a value larger than an int by rejecting any macro where any integer token in the expression is larger than an int
- Create an additional subsequent commit that not only matches the expression but also computes the value and checks it for range.

How does that sound?

In D124500#3499683, @LegalizeAdulthood wrote:

OK, so thinking about this review a little more, I propose this:

Take the check as is, but document that the initializing expressions may result in an invalid enum, particularly for C which restricts the underlying type to be int

Create a subsequent commit that rejects the enums where the language is C and the initializing expression is a value larger than an int by rejecting any macro where any integer token in the expression is larger than an int

Create an additional subsequent commit that not only matches the expression but also computes the value and checks it for range.

How does that sound?

I think that sounds like a good approach -- I expect the third bullet is when we'll likely learn whether we should have let Clang do the parsing or not (because then we're replicating not only the expression parsing but the constant evaluation calculations from the frontend).

I re-reviewed the patch as it stands today, and given the above plan, I think this is good to go. So it gets my LGTM and my thanks for the good discussions!
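The second step of the plan (rejecting any integer token larger than an `int`) could look roughly like the sketch below. This is only an illustration under stated assumptions: `fitsInInt` is a hypothetical helper that works on the literal's spelling, it is not part of the patch, and it ignores digit separators and binary literals.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of the patch: returns true if the integer
// literal spelled by Text has a value representable as int.
bool fitsInInt(const std::string &Text) {
  errno = 0;
  char *End = nullptr;
  // Base 0 lets strtoull recognize the 0x and 0 prefixes itself. Parsing
  // stops at the first character that is not a digit, so suffixes such as
  // U or L are simply ignored.
  unsigned long long Value = std::strtoull(Text.c_str(), &End, 0);
  if (End == Text.c_str() || errno == ERANGE)
    return false; // No digits were read, or the value overflowed.
  return Value <= static_cast<unsigned long long>(INT_MAX);
}
```

A per-token test like this is conservative in one direction only: every token can fit in an `int` while the whole expression does not, which is why the third step of the plan computes the value.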

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	> You're right that there can be unexpected interactions between checks when you run multiple of them concurrently, but this has always been the case and isn't surprising to me.

+1, this isn't a new issue. The reason I brought it up is because we've been bringing up this issue for a few years now and nobody has had the chance to try to fix the fixit infrastructure to improve the behavior of these kinds of interactions. So my fear is that we keep making the situation incrementally worse, and then it gets incrementally harder for anyone to fix it because of odd edge case behavior. That's not a reason for you to change what you're doing in this patch right now, though -- just background on where I'm coming from.
clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst
21–22	Maybe we should also list the binary `?:` GNU extension and comma expressions?

This revision is now accepted and ready to land.May 9 2022, 4:22 AM

LegalizeAdulthood added inline comments.May 9 2022, 8:29 AM

clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst
21–22	Yes, I need to update the docs on that and also call out the potential false positives explicitly.

LegalizeAdulthood added inline comments.May 9 2022, 8:32 AM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
100	I've already observed that `run-clang-tidy.py` can produce invalid fixits for header files, see bug 54885 and this discussion. I haven't yet concluded if it's a bug in the way I'm emitting fixits or a bug in the way `clang-apply-replacements` tries to de-duplicate fixits.

LegalizeAdulthood marked 3 inline comments as done.May 13 2022, 12:47 PM

Update documentation from review comments

Harbormaster completed remote builds in B164392: Diff 429353.May 13 2022, 3:35 PM

This revision was landed with ongoing or failed builds.May 13 2022, 5:46 PM

Closed by commit rG512273833136: [clang-tidy] Support expressions of literals in modernize-macro-to-enum (authored by LegalizeAdulthood). · Explain Why

This revision was automatically updated to reflect the committed changes.

LegalizeAdulthood added a commit: rG512273833136: [clang-tidy] Support expressions of literals in modernize-macro-to-enum.

aaronpuchert added a subscriber: aaronpuchert.May 14 2022, 4:06 AM

aaronpuchert added inline comments.

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

210–214

Seems to have caused a build failure:

FAILED: tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o 
/home/buildbots/clang.11.0.0/bin/clang++ --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/tools/extra/unittests/clang-tidy -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include -Itools/clang/include -Iinclude -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG    -Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o -MF 
tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d -o tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o -c /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15: error: unused function 'operator<<' [-Werror,-Wunused-function]
std::ostream &operator<<(std::ostream &Str,
              ^
1 error generated.

RKSimon mentioned this in rGffacaa0beccb: Fix unused function 'operator<<' -Wunused-function warning introduced in D124500.May 14 2022, 5:48 AM

LegalizeAdulthood added inline comments.May 14 2022, 11:27 AM

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
210–214	Simon Pilgrim fixed it, but I don't understand why clang calls this function unused. When the test fails, gtest uses this function to pretty print the parameter. I'm rebuilding with a forced test failure to validate.

LegalizeAdulthood added inline comments.May 14 2022, 11:50 AM

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

210–214

Yes, without this function the failing test prints results like this:

[ RUN      ] TokenExpressionParserTests/MatcherTest.MatchResult/123
D:\legalize\llvm\llvm-project\clang-tools-extra\unittests\clang-tidy\ModernizeModuleTest.cpp(200): error: Value of: matchText(GetParam().Text) == GetParam().Matched
  Actual: false
Expected: true
[  FAILED  ] TokenExpressionParserTests/MatcherTest.MatchResult/123, where GetParam() = 16-byte object <00-00 00-00 00-00 00-00 40-EC 3B-D6 F6-7F 00-00> (1 ms)

....which isn't particularly useful.

So how do we include pretty printers for tests without clang erroneously flagging them as unused?

aaronpuchert added inline comments.May 14 2022, 1:45 PM

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
210–214	What got me wondering: this definition is last in the file, and there is no prior declaration of this function. How can there be any uses of it? We're not in class scope, so all prior uses of `operator<<` or perhaps `PrintTo` must have been resolved to some other function already. Or am I missing something?

LegalizeAdulthood marked an inline comment as done.May 14 2022, 1:47 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
210–214	It seems what other tests do is define a friend function in the parameter class. I'm going to push that and see if that is accepted.
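The friend-function approach mentioned here can be sketched without gtest. In the example below, `describe` stands in for a generic printing facility and `Param` is a hypothetical test parameter type; neither is taken from the actual test file. The sketch shows only that a template defined before the type can still find the operator through argument-dependent lookup. Whether this form avoids `-Wunused-function` in every build configuration is not verified here.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for a generic printer: defined before Param exists, it still
// finds Param's operator<< at instantiation time via argument-dependent
// lookup.
template <typename T> std::string describe(const T &Value) {
  std::ostringstream Str;
  Str << Value;
  return Str.str();
}

// Hypothetical test parameter type.
struct Param {
  const char *Text;
  bool Matched;

  // Friend defined inside the class: visible to ADL only.
  friend std::ostream &operator<<(std::ostream &Str, const Param &Value) {
    return Str << "{ Text: \"" << Value.Text << "\", Matched: "
               << std::boolalpha << Value.Matched << " }";
  }
};
```

Calling `describe(Param{"1 + 2", true})` yields `{ Text: "1 + 2", Matched: true }` instead of a raw byte dump.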

LegalizeAdulthood marked an inline comment as done.May 14 2022, 1:48 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
210–214	Could have been an MSVC-ism, which is what I develop with. It was most definitely being used while I was working on the test cases.

LegalizeAdulthood marked an inline comment as done.May 14 2022, 5:28 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
205	So I was researching the C standard for what it says are acceptable initializer values for an enum, and it disallows the comma operator:

https://en.cppreference.com/w/c/language/enum

> integer constant expression whose value is representable as a value of type int

https://en.cppreference.com/w/c/language/constant_expression

> An integer constant expression is an expression that consists only of operators other than assignment, increment, decrement, function-call, or comma, except that cast operators can only cast arithmetic types to integer types

So I'll have to reject initializing expressions that use the comma operator when processing C files.

aaron.ballman added inline comments.May 16 2022, 4:33 AM

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
205	> So I'll have to reject initializing expressions that use the comma operator when processing C files.

Be careful when you do so: https://godbolt.org/z/dTMKv3a4v

Revision Contents

Path	Size
clang-tools-extra/clang-tidy/modernize/CMakeLists.txt	1 line
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h	73 lines
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp	232 lines
clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp	78 lines
clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst	27 lines
clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp	43 lines
clang-tools-extra/unittests/clang-tidy/CMakeLists.txt	2 lines
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp	214 lines

Diff 429395

clang-tools-extra/clang-tidy/modernize/CMakeLists.txt

  set(LLVM_LINK_COMPONENTS
    FrontendOpenMP
    Support
    )

  add_clang_library(clangTidyModernizeModule
    AvoidBindCheck.cpp
    AvoidCArraysCheck.cpp
    ConcatNestedNamespacesCheck.cpp
    DeprecatedHeadersCheck.cpp
    DeprecatedIosBaseAliasesCheck.cpp
+   IntegralLiteralExpressionMatcher.cpp
    LoopConvertCheck.cpp
    LoopConvertUtils.cpp
    MacroToEnumCheck.cpp
    MakeSharedCheck.cpp
    MakeSmartPtrCheck.cpp
    MakeUniqueCheck.cpp
    ModernizeTidyModule.cpp
    PassByValueCheck.cpp
  Show All 40 Lines

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h

This file was added.

//===--- IntegralLiteralExpressionMatcher.h - clang-tidy ------------------===//
//

LegalizeAdulthood (author): Uh.... I guess this needs some sort of copyright notice?

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_MODERNIZE_INTEGRAL_LITERAL_EXPRESSION_MATCHER_H
#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_MODERNIZE_INTEGRAL_LITERAL_EXPRESSION_MATCHER_H

aaron.ballman: We don't use #pragma once (not portable, not reliable).

#include <clang/Lex/Token.h>
#include <llvm/ADT/ArrayRef.h>

namespace clang {
namespace tidy {
namespace modernize {

// Parses an array of tokens and returns true if they conform to the rules of
// C++ for whole expressions involving integral literals. Follows the operator
// precedence rules of C++.

aaron.ballman: Oh boy, I'm not super excited about having another parser to maintain... It'd be nice if we had a ParserUtils.cpp/h file that made it easier to go from an arbitrary array of tokens to AST nodes + success/failure information on parsing the tokens. It's not strictly needed for what you're trying to accomplish here, but it would be a much more general interface and it would remove the support burden from adding another parser that's out of Clang's tree.

LegalizeAdulthood (author): Yeah, I'm not a fan of duplication either, but see my earlier comments about why I think clangParse is overkill here.

class IntegralLiteralExpressionMatcher {
public:
  IntegralLiteralExpressionMatcher(ArrayRef<Token> Tokens)
      : Current(Tokens.begin()), End(Tokens.end()) {}

  bool match();

private:
  bool advance();
  bool consume(tok::TokenKind Kind);
  bool nonTerminalChainedExpr(
      bool (IntegralLiteralExpressionMatcher::*NonTerminal)(),
      const std::function<bool(Token)> &IsKind);
  template <tok::TokenKind Kind>
  bool nonTerminalChainedExpr(
      bool (IntegralLiteralExpressionMatcher::*NonTerminal)()) {
    return nonTerminalChainedExpr(NonTerminal,
                                  [](Token Tok) { return Tok.is(Kind); });
  }
  template <tok::TokenKind K1, tok::TokenKind K2, tok::TokenKind... Ks>
  bool nonTerminalChainedExpr(
      bool (IntegralLiteralExpressionMatcher::*NonTerminal)()) {
    return nonTerminalChainedExpr(
        NonTerminal, [](Token Tok) { return Tok.isOneOf(K1, K2, Ks...); });
  }

  bool unaryOperator();
  bool unaryExpr();
  bool multiplicativeExpr();
  bool additiveExpr();
  bool shiftExpr();
  bool compareExpr();
  bool relationalExpr();
  bool equalityExpr();
  bool andExpr();
  bool exclusiveOrExpr();
  bool inclusiveOrExpr();
  bool logicalAndExpr();
  bool logicalOrExpr();
  bool conditionalExpr();
  bool commaExpr();
  bool expr();

  ArrayRef<Token>::iterator Current;
  ArrayRef<Token>::iterator End;
};

} // namespace modernize
} // namespace tidy
} // namespace clang

#endif

clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp

This file was added.

//===--- IntegralLiteralExpressionMatcher.cpp - clang-tidy ----------------===//

LegalizeAdulthood (author): ditto

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//===----------------------------------------------------------------------===//

#include "IntegralLiteralExpressionMatcher.h"

#include <cctype>
#include <stdexcept>

namespace clang {
namespace tidy {
namespace modernize {

// Validate that this literal token is a valid integer literal. A literal token
// could be a floating-point token, which isn't acceptable as a value for an
// enumeration. A floating-point token must either have a decimal point or an
// exponent ('E' or 'P').
static bool isIntegralConstant(const Token &Token) {
  const char *Begin = Token.getLiteralData();
  const char *End = Begin + Token.getLength();

  // Not a hexadecimal floating-point literal.

aaron.ballman:

  const char *End = Begin + Token.getLength();
- // not a hexadecimal floating-point literal
+ // Not a hexadecimal floating-point literal.
  if (Token.getLength() > 2 && Begin[0] == '0' && std::toupper(Begin[1]) == 'X')

(Same suggestion elsewhere -- just double check that all the comments are full sentences with capitalization and punctuation.)

  if (Token.getLength() > 2 && Begin[0] == '0' && std::toupper(Begin[1]) == 'X')
    return std::none_of(Begin + 2, End, [](char C) {
      return C == '.' || std::toupper(C) == 'P';
    });

  // Not a decimal floating-point literal or complex literal.
  return std::none_of(Begin, End, [](char C) {
    return C == '.' || std::toupper(C) == 'E' || std::toupper(C) == 'I';

aaron.ballman:

Do we need to care about integer suffixes that make a non-integer type, like: https://godbolt.org/z/vx3xbGa41

LegalizeAdulthood (author):

I don't think those will be parsed as literal tokens by the preprocessor, but I'll check.

  });
}
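For readers without a clang build at hand, the character scan in `isIntegralConstant` can be tried out on plain strings. The sketch below mirrors the logic of the function above but is not the patch's code: `looksIntegral` is a hypothetical name and it takes the literal's spelling instead of a `Token`.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical string-based version of the check: true if the literal
// spelling has no floating-point or complex marker in it.
bool looksIntegral(const std::string &Text) {
  auto Upper = [](char C) {
    return std::toupper(static_cast<unsigned char>(C));
  };
  // Hexadecimal: 'E' is a digit here, so only '.' and the binary exponent
  // 'P' indicate a floating-point literal.
  if (Text.size() > 2 && Text[0] == '0' && Upper(Text[1]) == 'X')
    return std::none_of(Text.begin() + 2, Text.end(), [&](char C) {
      return C == '.' || Upper(C) == 'P';
    });
  // Decimal or octal: '.', an 'E' exponent, or an 'I' complex suffix rule
  // the literal out.
  return std::none_of(Text.begin(), Text.end(), [&](char C) {
    return C == '.' || Upper(C) == 'E' || Upper(C) == 'I';
  });
}
```

The hexadecimal branch matters because a spelling such as `0x1E` contains an `E` without being a floating-point literal.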

bool IntegralLiteralExpressionMatcher::advance() {
  ++Current;
  return Current != End;
}

bool IntegralLiteralExpressionMatcher::consume(tok::TokenKind Kind) {
  if (Current->is(Kind)) {
    ++Current;
    return true;
  }

  return false;
}

bool IntegralLiteralExpressionMatcher::nonTerminalChainedExpr(
    bool (IntegralLiteralExpressionMatcher::*NonTerminal)(),
    const std::function<bool(Token)> &IsKind) {
  if (!(this->*NonTerminal)())
    return false;
  if (Current == End)
    return true;

  while (Current != End) {
    if (!IsKind(*Current))
      break;

    if (!advance())
      return false;

    if (!(this->*NonTerminal)())
      return false;
  }

  return true;
}
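The chained-expression helper above is the core of the recursive descent: each precedence level is "operand, then any number of (operator, operand) pairs", with the operand parsed by the next-higher level. The toy recognizer below shows the same shape on plain strings. It is a simplified sketch, not the patch: `TinyMatcher`, `Tok`, and `matchTokens` are hypothetical names, tokens are just their spellings, and only additive, multiplicative, unary, and parenthesized expressions over decimal digits are handled.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for clang's Token: a token is just its spelling.
using Tok = std::string;

class TinyMatcher {
public:
  explicit TinyMatcher(const std::vector<Tok> &Tokens)
      : Current(Tokens.begin()), End(Tokens.end()) {}

  // True only if the whole token sequence forms one complete expression.
  bool match() { return Current != End && additive() && Current == End; }

private:
  using Rule = bool (TinyMatcher::*)();

  // operand (op operand)*: the shape shared by every binary level.
  bool chained(Rule Operand, const std::string &Ops) {
    if (!(this->*Operand)())
      return false;
    while (Current != End && Current->size() == 1 &&
           Ops.find((*Current)[0]) != std::string::npos) {
      ++Current; // Skip the operator.
      if (Current == End || !(this->*Operand)())
        return false; // An operator must be followed by an operand.
    }
    return true;
  }

  bool additive() { return chained(&TinyMatcher::multiplicative, "+-"); }
  bool multiplicative() { return chained(&TinyMatcher::unary, "*/%"); }

  bool unary() {
    // Skip at most one leading unary operator.
    if (Current != End && Current->size() == 1 &&
        std::string("+-~!").find((*Current)[0]) != std::string::npos)
      ++Current;
    if (Current == End)
      return false;
    if (*Current == "(") {
      ++Current;
      if (Current == End || !additive())
        return false;
      if (Current == End || *Current != ")")
        return false;
      ++Current;
      return true;
    }
    // In this toy version an operand is a non-empty run of decimal digits.
    if (Current->empty())
      return false;
    for (char C : *Current)
      if (!std::isdigit(static_cast<unsigned char>(C)))
        return false;
    ++Current;
    return true;
  }

  std::vector<Tok>::const_iterator Current, End;
};

bool matchTokens(const std::vector<Tok> &Tokens) {
  return TinyMatcher(Tokens).match();
}
```

As in the summary of this revision, partial expressions are rejected: `1 +` fails because the operator has no right operand, and `1 2` fails because tokens remain after a complete expression.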

// Advance over unary operators.
bool IntegralLiteralExpressionMatcher::unaryOperator() {
  if (Current->isOneOf(tok::TokenKind::minus, tok::TokenKind::plus,
                       tok::TokenKind::tilde, tok::TokenKind::exclaim)) {
    return advance();
  }

  return true;
}

bool IntegralLiteralExpressionMatcher::unaryExpr() {
  if (!unaryOperator())
    return false;

  if (consume(tok::TokenKind::l_paren)) {
    if (Current == End)
      return false;

    if (!expr())
      return false;

    if (Current == End)
      return false;

    return consume(tok::TokenKind::r_paren);
  }

aaron.ballman:

I know this is code moved from elsewhere, but I suppose we never considered the odd edge case where a user does something like "foo"[0] as a really awful integer constant. :-D

LegalizeAdulthood (author):

It's always possible to write crazy contorted code and have a check not recognize it. I don't think it's worthwhile to spend time trying to handle torture cases unless we can find data that shows it's prevalent in real world code.

If I was doing a code review and saw this:

enum {
    FOO = "foo"[0]
};

I'd flag that in a code review as bogus, whereas if I saw:

enum {
    FOO = 'f'
};

That would be acceptable, which is why character literals are accepted as an integral literal initializer for an enum in this check.

aaron.ballman:

> I don't think it's worthwhile to spend time trying to handle torture cases unless we can find data that shows it's prevalent in real world code.

I think I'm okay agreeing to that in this particular case, but this is more to point out that writing your own parser is a maintenance burden. Users will hit cases we've both forgotten about here, they'll file a bug, then someone has to deal with it. It's very hard to justify to users "we think you write silly code" because they often have creative ways in which their code is not actually so silly, especially when we support "most" valid expressions.

LegalizeAdulthood (author):

Writing your own parser is unavoidable here because we can't just assume that any old thing will be a valid initializer just by looking at the set of tokens present in the macro body. (There is a separate discussion going on about improving the preprocessor support and parsing things more deeply, but that isn't even to the point of a prototype yet.) The worst thing we can do is create "fixits" that produce invalid code.

The worst that happens if your expression isn't recognized is that your macro isn't matched as a candidate for an enum. You can always make it an enum manually and join it with adjacent macros that were recognized and converted.

As it stands, the check only recognizes a single literal with an optional unary operator.

This change expands the check to recognize a broad range of expressions, allowing those macros to be converted to enums. I opened the issue because running modernize-macro-to-enum on the ITK codebase showed some simple expressions involving literals that weren't recognized and converted.

If an expression isn't recognized and an issue is opened, it will be an enhancement request to support a broader range of expressions, not a bug that this check created invalid code.

IMO, the more useful thing that's missing from the grammar is recognizing sizeof expressions rather than indexing string literals with an integral literal subscript.

I had planned on doing another increment to recognize sizeof expressions.

aaron.ballman:

> Writing your own parser is unavoidable here because we can't just assume that any old thing will be a valid initializer just by looking at the set of tokens present in the macro body.

If you ran the token sequence through clang's parser and got an AST node out, you'd have significantly *more* information as to whether something is a valid enum constant initializer because you can check that it's an actual constant expression *and* that it's within a valid range of values. This not only fixes edge case bugs with your approach (like the fact that you can generate a series of literal expressions that result in a value too large to store within an enumerator constant), but it enables new functionality your approach currently disallows (like using constexpr variables instead of just numeric literals).

So I don't agree that it's unavoidable to write another custom parser.

LegalizeAdulthood (author):

You keep bringing up the idea that the values have to be known, but so far they don't.

Remember, we are replacing macro identifiers with anonymous enum identifiers. We aren't specifying a restricting type to the enum, so as long as it's a valid integral literal expression, we're not changing any semantics. Unscoped enums also allow arbitrary conversions to/from an underlying integral type chosen by the compiler.

C++20 9.7.1 paragraph 7 says:

> For an enumeration whose underlying type is not fixed, the underlying type is an integral type that can represent all the enumerator values defined in the enumeration. If no integral type can represent all the enumerator values, the enumeration is ill-formed. It is implementation-defined which integral type is used as the underlying type except that the underlying type shall not be larger than int unless the value of an enumerator cannot fit in an int or unsigned int. If the enumerator-list is empty, the underlying type is as if the enumeration had a single enumerator with value 0.

So the compiler is free to pick an underlying type that's large enough to handle all the explicitly listed initial values. Do we actually need to know the values for this check? I don't think so, because we aren't changing anything about the type of the named values. When the compiler evaluates an integral literal, it goes through a similar algorithm assigning the appropriate type to those integral values:

C++20 5.9 paragraph 2:

> A preprocessing number does not have a type or a value; it acquires both after a successful conversion to an integer-literal token or a floating-point-literal token.

C++20 5.13.2 paragraph 3:

> The type of an integer-literal is the first type in the list in Table 8 corresponding to its optional integer-suffix in which its value can be represented.

The table says the type is int, unsigned int, long int, unsigned long int, long long int, or unsigned long long int based on the suffix and the value and that the type is chosen to be big enough to hold the value if the suffix is unspecified.

> but [using clangParse] enables new functionality your approach currently disallows (like using constexpr variables instead of just numeric literals).

I agree that if we used the full parser, we'd bring in constexpr expressions as valid initializers for the enums. However, before engaging in all that work, I'd like to see how likely this is in existing codebases by feedback from users requesting the support. Maybe engaging the parser isn't a big amount of work, I don't actually know. I've never looked deeply at the actual parsing code in clang. Maybe it's easy enough to throw a bag of tokens at it and get back an AST node, maybe not. (I suspect not based on my experience with the code base so far.)

My suspicion is that code bases that are heavy with macros for constants aren't using modern C++ in the body of those macros to define the values of those constants. Certainly this is 100% true for C code that uses macros to define constants, by definition. This check applies equally well to C code as C has had enums forever but even recent C code still tends to use macros for constants.

Still, my suspicions aren't data. I'd like to get this check deployed in a basic fashion and let user feedback provide data on what is important.

> So I don't agree that it's unavoidable to write another custom parser.

That's a fair point. Some kind of parser is needed to recognize valid initializer expressions or we run the risk of transforming valid code into invalid code. Whether it is a custom recognizer as I've done or clangParse is what we're debating here.
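The standard passages quoted in this comment can be checked with a few compile-time assertions. The sketch below is an illustration only; the enumerator names and `bigValue` are made up, and it is C++ specific (as discussed in the replies, C restricts enumeration constants to values representable as `int`).

```cpp
#include <climits>
#include <type_traits>

// The type of an integer literal follows its suffix when one is present.
static_assert(std::is_same<decltype(1), int>::value, "no suffix: int");
static_assert(std::is_same<decltype(1u), unsigned int>::value, "u suffix");
static_assert(std::is_same<decltype(1LL), long long>::value, "LL suffix");

// An unscoped enum without a fixed underlying type: the compiler must pick
// an underlying type wide enough for every enumerator. Big needs 33 bits.
enum { Small = 1, Big = 0x100000000 };

static_assert(
    sizeof(std::underlying_type<decltype(Big)>::type) * CHAR_BIT > 32,
    "underlying type is wide enough to hold Big");

// The enumerator keeps its full value when converted to an integer.
inline unsigned long long bigValue() { return Big; }
```

This is why the check does not need the values for C++ with an unfixed underlying type: the compiler widens the enum as required.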

aaron.ballman:

> You keep bringing up the idea that the values have to be known, but so far they don't.

See comments at the top level.

> So the compiler is free to pick an underlying type that's large enough to handle all the explicitly listed initial values. Do we actually need to know the values for this check?

Yes, C requires the enumeration constants to be representable with int. But also, because this is in the modernize module, it's very likely we'll be getting a request to convert to using a scoped enumeration or an enumeration with the appropriate fixed underlying type in C++ as well.

LegalizeAdulthood (author):

Oh, I see now, thanks for explaining it. I didn't realize that C restricts the values to `int`.

aaron.ballman:

You're welcome, sorry for not pointing it out sooner!

LegalizeAdulthoodAuthorUnsubmitted

Done

Regarding conversion to a scoped enum, I think that is best handled by a separate enum-to-scoped-enum check. I have one I've been working on separately. As bad as it is to convert macros (since they have no respect for structure or scope), it's quite a bit of work to convert a non-scoped enum to an enum because now implicit conversions enter the picture and expressions involving macros (e.g. FLAG_X | FLAG_Y) also get much more complicated. Not only that but usages have to have types updated. I don't think it's very useful to upgrade to a scoped enum and then have every use wrapped in static_cast<int>(). It just creates uglier code than what was there before and I don't think people would adopt such a check.

Regarding conversion to an appropriate fixed underlying type, that isn't allowed on unscoped enums, only on scoped enums, so it has all the above complexity plus selecting the appropriate fixed underlying type.

LegalizeAdulthood: Regarding conversion to a scoped enum, I think that is best handled by a separate enum-to…

aaron.ballmanUnsubmitted

Done

Regarding conversion to a scoped enum, I think that is best handled by a separate enum-to-scoped-enum check.

It's been a while since I checked, but I recall that checks with interacting fix-its tend not to play well together. We should probably see if that's still the case today. As an example, if the enum-to-scoped-enum check runs BEFORE the modernize-macros-to-enum check, then the behavior will be worse than if the checks are run in the reverse order. Because of issues like that, I'm not quite as convinced that a separate check is best (though I do agree it's notionally better).

Regarding conversion to an appropriate fixed underlying type, that isn't allowed on unscoped enums, only on scoped enums, so it has all the above complexity plus selecting the appropriate fixed underlying type.

That's incorrect; fixed underlying types and scoped enumerations are orthogonal features (though a scoped enumeration always has a fixed underlying type): https://godbolt.org/z/sGYsjdnrT

aaron.ballman: > Regarding conversion to a scoped enum, I think that is best handled by a separate enum-to…

LegalizeAdulthoodAuthorUnsubmitted

Done

You're right that there can be unexpected interactions between checks when you run multiple of them concurrently, but this has always been the case and isn't surprising to me. This doesn't seem to be a situation unique to these checks though. As more and more transformations become available through clang-tidy, it's inevitable that two different checks will want to modify the same piece of code. For instance, the identifier naming check and modernize-loop-convert. Modernize-loop-convert can eliminate variables entirely (iterators go poof), while the identifier check wants to rename the iterators.

Huh. OK, good to know. I tried doing an underlying type on an unscoped enum and I got a compilation error; I must have just done it wrong.

aaron.ballman:

> You're right that there can be unexpected interactions between checks when you run multiple of them concurrently, but this has always been the case and isn't surprising to me.

+1, this isn't a new issue. The reason I brought it up is because we've been bringing up this issue for a few years now and nobody has had the chance to try to fix the fixit infrastructure to improve the behavior of these kinds of interactions. So my fear is that we keep making the situation incrementally worse, and then it gets incrementally harder for anyone to fix it because of odd edge case behavior. That's not a reason for you to change what you're doing in this patch right now, though -- just background on where I'm coming from.

LegalizeAdulthood (Author):

I've already observed that run-clang-tidy.py can produce invalid fixits for header files, see bug 54885 and this discussion. I haven't yet concluded if it's a bug in the way I'm emitting fixits or a bug in the way clang-apply-replacements tries to de-duplicate fixits.


if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||

!isIntegralConstant(*Current)) {

return false;

}

++Current;

return true;

}

bool IntegralLiteralExpressionMatcher::multiplicativeExpr() {

return nonTerminalChainedExpr<tok::TokenKind::star, tok::TokenKind::slash,

tok::TokenKind::percent>(

&IntegralLiteralExpressionMatcher::unaryExpr);

}

bool IntegralLiteralExpressionMatcher::additiveExpr() {

return nonTerminalChainedExpr<tok::plus, tok::minus>(

&IntegralLiteralExpressionMatcher::multiplicativeExpr);

}

bool IntegralLiteralExpressionMatcher::shiftExpr() {

return nonTerminalChainedExpr<tok::TokenKind::lessless,

tok::TokenKind::greatergreater>(

&IntegralLiteralExpressionMatcher::additiveExpr);

}

bool IntegralLiteralExpressionMatcher::compareExpr() {

if (!shiftExpr())

return false;

if (Current == End)

return true;

if (Current->is(tok::TokenKind::spaceship)) {

if (!advance())

return false;

if (!shiftExpr())

return false;

}

return true;

}

bool IntegralLiteralExpressionMatcher::relationalExpr() {

return nonTerminalChainedExpr<tok::TokenKind::less, tok::TokenKind::greater,

tok::TokenKind::lessequal,

tok::TokenKind::greaterequal>(

&IntegralLiteralExpressionMatcher::compareExpr);

}

bool IntegralLiteralExpressionMatcher::equalityExpr() {

return nonTerminalChainedExpr<tok::TokenKind::equalequal,

tok::TokenKind::exclaimequal>(

&IntegralLiteralExpressionMatcher::relationalExpr);

}

bool IntegralLiteralExpressionMatcher::andExpr() {

return nonTerminalChainedExpr<tok::TokenKind::amp>(

&IntegralLiteralExpressionMatcher::equalityExpr);

}

bool IntegralLiteralExpressionMatcher::exclusiveOrExpr() {

return nonTerminalChainedExpr<tok::TokenKind::caret>(

&IntegralLiteralExpressionMatcher::andExpr);

}

bool IntegralLiteralExpressionMatcher::inclusiveOrExpr() {

return nonTerminalChainedExpr<tok::TokenKind::pipe>(

&IntegralLiteralExpressionMatcher::exclusiveOrExpr);

}

bool IntegralLiteralExpressionMatcher::logicalAndExpr() {

return nonTerminalChainedExpr<tok::TokenKind::ampamp>(

&IntegralLiteralExpressionMatcher::inclusiveOrExpr);

}

bool IntegralLiteralExpressionMatcher::logicalOrExpr() {

return nonTerminalChainedExpr<tok::TokenKind::pipepipe>(

&IntegralLiteralExpressionMatcher::logicalAndExpr);

}

bool IntegralLiteralExpressionMatcher::conditionalExpr() {

if (!logicalOrExpr())

return false;

if (Current == End)

return true;

aaron.ballman:

There is a GNU extension in this space: https://godbolt.org/z/PrWY3T6hY

LegalizeAdulthood (Author):

Do you have a link for the extension?

aaron.ballman:

https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html
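
For readers unfamiliar with the extension, here is a small sketch of what it means in an enum initializer. It assumes a compiler that accepts the GNU omitted-operand conditional (GCC or Clang in their default modes; it is not standard C++ and may be diagnosed under pedantic settings). The enumerator names and helper functions are hypothetical:

```cpp
#include <cassert>

// GNU extension: `x ?: y` behaves like `x ? x : y`.
enum {
  FromCondition = 100 ?: 8, // 100 is nonzero, so the result is 100
  FromFallback = 0 ?: 8     // 0 is zero, so the result is 8
};

// Small accessors so the values can be inspected at run time.
inline int fromCondition() { return FromCondition; }
inline int fromFallback() { return FromFallback; }
```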


if (Current->is(tok::TokenKind::question)) {

if (!advance())

return false;

// A gcc extension allows x ? : y as a synonym for x ? x : y.

if (Current->is(tok::TokenKind::colon)) {

if (!advance())

return false;

if (!expr())

return false;

return true;

}

if (!expr())

return false;

if (Current == End)

return false;

aaron.ballman:

Comma operator?

LegalizeAdulthood (Author):

Remember that the use case here is identifying expressions that are initializers for an enum. If you were doing a code review and you saw this:

enum {
    FOO = (2, 3)
};

Would you be OK with that? I wouldn't. Clang even warns about it: https://godbolt.org/z/Y641cb8Y9

Therefore I deliberately left comma operator out of the grammar.

aaron.ballman:

This is another case where I think you're predicting that users won't be using the full expressivity of the language and we'll get bug reports later. Again, in isolation, I tend to agree that I wouldn't be happy seeing that code. However, users write some very creative code and there's no technical reason why we can't or shouldn't handle comma operators.

LegalizeAdulthood (Author):

"Don't let the perfect be the enemy of the good."

My inclination is to simply state explicitly in the documentation that the comma operator is not recognized. It's already implicit by its absence from the list of recognized operators.

Again, the worst that happens is that your macro isn't converted.

I'm open to being convinced that it's important, but you haven't convinced me yet :)

aaron.ballman:

> "Don't let the perfect be the enemy of the good."

This is a production compiler toolchain. Correctness is important and that sometimes means caring more about perfection than you otherwise would like to.

> I'm open to being convinced that it's important, but you haven't convinced me yet :)

It's less about importance and more about maintainability coupled with correctness. With your approach, we get something that will have a long tail of bugs. If you used Clang's parser, you don't get the same issue -- maintenance largely comes along for free, and the bugs are far less likely.

About the only reason I like your current approach over using clang's parsing is that it quite likely performs much better than doing an actual token parsing of the expression. But as you pointed out, about the worst thing a check can do is take correct code and make it incorrect -- doing that right requires some amount of semantic evaluation of the expressions (which you're not doing). For example:

#define FINE 1LL << 30LL;
#define BAD 1LL << 31LL;
#define ALSO_BAD 1LL << 32L;

We'll convert this into an enumeration and break -pedantic-errors builds in C. If we had a ConstantExpr object, we could see what its value is and note that it's greater than what fits into an int and decide to do something smarter.
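
To illustrate the range concern, here is a small sketch that assumes a 32-bit `int` and drops the trailing semicolons from the macros above. `fitsInInt` and the constant names are hypothetical helpers for this illustration only; they are not part of clang-tidy and this is not how the check evaluates expressions:

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: true when V is representable as an int.
constexpr bool fitsInInt(long long V) { return V >= INT_MIN && V <= INT_MAX; }

// The shift expressions from the comment above, without the semicolons.
constexpr long long Fine = 1LL << 30LL;
constexpr long long Bad = 1LL << 31LL;
constexpr long long AlsoBad = 1LL << 32L;

// The claims below only hold where int is 32 bits wide.
constexpr bool IntIs32Bits = sizeof(int) * CHAR_BIT == 32;

static_assert(!IntIs32Bits || fitsInInt(Fine), "1LL << 30 fits in int");
static_assert(!IntIs32Bits || !fitsInInt(Bad), "1LL << 31 does not fit");
static_assert(!IntIs32Bits || !fitsInInt(AlsoBad), "1LL << 32 does not fit");
```

A purely token-based matcher accepts all three shifts as syntactically valid, which is why a value check of this kind would need some form of constant evaluation.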

So I continue to see the current approach as being somewhat reasonable (especially for experimentation), but incorrect in the long run. Not sufficiently incorrect for me to block this patch on, but incorrect enough that the first time this check becomes a maintenance burden, I'll be asking more strongly to do this the correct way.

LegalizeAdulthood (Author):

> > "Don't let the perfect be the enemy of the good."
>
> This is a production compiler toolchain. Correctness is important and that sometimes means caring more about perfection than you otherwise would like to.

That's fair.

> For example:
>
> #define FINE 1LL << 30LL;
> #define BAD 1LL << 31LL;
> #define ALSO_BAD 1LL << 32L;

Oh this brings up the pesky "semicolons disappear from the AST" issue. I wonder what happens when we're just processing tokens, though. I will add a test to see. This could be a case where my approach results in more correctness than clangParse!

> Not sufficiently incorrect for me to block this patch on, but incorrect enough that the first time this check becomes a maintenance burden, I'll be asking more strongly to do this the correct way.

I agree.

LegalizeAdulthood (Author):

It wasn't much extra work/code to add comma operator support so I've done that.

LegalizeAdulthood (Author):

So I was researching the C standard for what it says are acceptable initializer values for an enum and it disallows the comma operator:

https://en.cppreference.com/w/c/language/enum

> integer constant expression whose value is representable as a value of type int

https://en.cppreference.com/w/c/language/constant_expression

> An integer constant expression is an expression that consists only of
>
> operators other than assignment, increment, decrement, function-call, or comma, except that cast operators can only cast arithmetic types to integer types

So I'll have to reject initializing expressions that use the comma operator when processing C files.

aaron.ballman (Not Done):

> So I'll have to reject initializing expressions that use the comma operator when processing C files.

Be careful when you do so: https://godbolt.org/z/dTMKv3a4v


if (!Current->is(tok::TokenKind::colon))

return false;

if (!advance())

return false;

if (!expr())

return false;

}

return true;

}

bool IntegralLiteralExpressionMatcher::commaExpr() {

return nonTerminalChainedExpr<tok::TokenKind::comma>(

&IntegralLiteralExpressionMatcher::conditionalExpr);

}

bool IntegralLiteralExpressionMatcher::expr() { return commaExpr(); }

bool IntegralLiteralExpressionMatcher::match() {

return expr() && Current == End;

}

} // namespace modernize

} // namespace tidy

} // namespace clang

LegalizeAdulthood (Author):

Structure of these feels very similar, I'll see if I can squish out the duplication
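
The shape that the repeated rules share can be sketched as a single "operand, then zero or more (operator, operand) pairs" helper. The toy below is only an illustration of that idea under simplifying assumptions: it works on pre-split string tokens rather than clang `Token`s, handles only two precedence levels, and passes the operators as a runtime list instead of template arguments. `ToyMatcher`, `chained` and `matches` are hypothetical names, not the patch's `nonTerminalChainedExpr`:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <utility>
#include <vector>

// Toy recursive descent matcher over string tokens (not the clang-tidy class).
class ToyMatcher {
public:
  explicit ToyMatcher(std::vector<std::string> Toks)
      : Tokens(std::move(Toks)) {}

  // Succeeds only when every token is consumed by one complete expression.
  bool match() { return additive() && Pos == Tokens.size(); }

private:
  using Rule = bool (ToyMatcher::*)();

  // Matches: Operand (Op Operand)* for any Op in Ops.
  // A dangling operator with no right-hand operand is rejected.
  bool chained(std::initializer_list<const char *> Ops, Rule Operand) {
    if (!(this->*Operand)())
      return false;
    while (Pos < Tokens.size() && isOneOf(Ops)) {
      ++Pos; // consume the operator
      if (!(this->*Operand)())
        return false;
    }
    return true;
  }

  bool isOneOf(std::initializer_list<const char *> Ops) const {
    for (const char *Op : Ops)
      if (Tokens[Pos] == Op)
        return true;
    return false;
  }

  // Accepts a token made only of decimal digits.
  bool literal() {
    if (Pos == Tokens.size() || Tokens[Pos].empty())
      return false;
    for (char C : Tokens[Pos])
      if (!std::isdigit(static_cast<unsigned char>(C)))
        return false;
    ++Pos;
    return true;
  }

  bool multiplicative() {
    return chained({"*", "/", "%"}, &ToyMatcher::literal);
  }
  bool additive() { return chained({"+", "-"}, &ToyMatcher::multiplicative); }

  std::vector<std::string> Tokens;
  std::size_t Pos = 0;
};

inline bool matches(std::vector<std::string> Toks) {
  return ToyMatcher(std::move(Toks)).match();
}
```

With this shape, each precedence level becomes a one-line rule that names its operators and the next-tighter rule, which mirrors how the rules in the diff above are written.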


clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp

//===--- MacroToEnumCheck.cpp - clang-tidy --------------------------------===//		//===--- MacroToEnumCheck.cpp - clang-tidy --------------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "MacroToEnumCheck.h"		#include "MacroToEnumCheck.h"
		#include "IntegralLiteralExpressionMatcher.h"

#include "clang/AST/ASTContext.h"		#include "clang/AST/ASTContext.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"		#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Lex/Preprocessor.h"		#include "clang/Lex/Preprocessor.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cctype>		#include <cctype>
#include <string>		#include <string>
[50 lines not shown]	default:
State = WhiteSpace::Nothing;		State = WhiteSpace::Nothing;
break;		break;
}		}
}		}

return true;		return true;
}		}

// Validate that this literal token is a valid integer literal. A literal token
// could be a floating-point token, which isn't acceptable as a value for an
// enumeration. A floating-point token must either have a decimal point or an
// exponent ('E' or 'P').
static bool isIntegralConstant(const Token &Token) {
const char *Begin = Token.getLiteralData();
const char *End = Begin + Token.getLength();

// not a hexadecimal floating-point literal
if (Token.getLength() > 2 && Begin[0] == '0' && std::toupper(Begin[1]) == 'X')
return std::none_of(Begin + 2, End, [](char C) {
return C == '.' || std::toupper(C) == 'P';
});

// not a decimal floating-point literal
return std::none_of(
Begin, End, [](char C) { return C == '.' || std::toupper(C) == 'E'; });
}

static StringRef getTokenName(const Token &Tok) {		static StringRef getTokenName(const Token &Tok) {
return Tok.is(tok::raw_identifier) ? Tok.getRawIdentifier()		return Tok.is(tok::raw_identifier) ? Tok.getRawIdentifier()
: Tok.getIdentifierInfo()->getName();		: Tok.getIdentifierInfo()->getName();
}		}

namespace {		namespace {

struct EnumMacro {		struct EnumMacro {
[124 lines not shown]	private:
void checkCondition(SourceRange ConditionRange);		void checkCondition(SourceRange ConditionRange);
void checkName(const Token &MacroNameTok);		void checkName(const Token &MacroNameTok);
void rememberExpressionName(const Token &Tok);		void rememberExpressionName(const Token &Tok);
void rememberExpressionTokens(ArrayRef<Token> MacroTokens);		void rememberExpressionTokens(ArrayRef<Token> MacroTokens);
void invalidateExpressionNames();		void invalidateExpressionNames();
void issueDiagnostics();		void issueDiagnostics();
void warnMacroEnum(const EnumMacro &Macro) const;		void warnMacroEnum(const EnumMacro &Macro) const;
void fixEnumMacro(const MacroList &MacroList) const;		void fixEnumMacro(const MacroList &MacroList) const;
		bool isInitializer(ArrayRef<Token> MacroTokens);

MacroToEnumCheck *Check;		MacroToEnumCheck *Check;
const LangOptions &LangOpts;		const LangOptions &LangOpts;
const SourceManager &SM;		const SourceManager &SM;
SmallVector<MacroList> Enums;		SmallVector<MacroList> Enums;
SmallVector<FileState> Files;		SmallVector<FileState> Files;
std::vector<std::string> ExpressionNames;		std::vector<std::string> ExpressionNames;
FileState *CurrentFile = nullptr;		FileState *CurrentFile = nullptr;
[87 lines not shown]	if (!SM.isInMainFile(Loc))
Files.back().GuardScanner = IncludeGuard::FileChanged;		Files.back().GuardScanner = IncludeGuard::FileChanged;
} else if (Reason == ExitFile) {		} else if (Reason == ExitFile) {
assert(CurrentFile->ConditionScopes == 0);		assert(CurrentFile->ConditionScopes == 0);
Files.pop_back();		Files.pop_back();
}		}
CurrentFile = &Files.back();		CurrentFile = &Files.back();
}		}

		bool MacroToEnumCallbacks::isInitializer(ArrayRef<Token> MacroTokens)
		{
		IntegralLiteralExpressionMatcher Matcher(MacroTokens);
		return Matcher.match();
		LegalizeAdulthood (Author): inline variable made explicit for debugging
		}


// Any defined but rejected macro is scanned for identifiers that		// Any defined but rejected macro is scanned for identifiers that
// are to be excluded as enums.		// are to be excluded as enums.
void MacroToEnumCallbacks::MacroDefined(const Token &MacroNameTok,		void MacroToEnumCallbacks::MacroDefined(const Token &MacroNameTok,
const MacroDirective *MD) {		const MacroDirective *MD) {
// Include guards are never candidates for becoming an enum.		// Include guards are never candidates for becoming an enum.
if (CurrentFile->GuardScanner == IncludeGuard::IfGuard) {		if (CurrentFile->GuardScanner == IncludeGuard::IfGuard) {
CurrentFile->GuardScanner = IncludeGuard::DefineGuard;		CurrentFile->GuardScanner = IncludeGuard::DefineGuard;
return;		return;
[9 lines not shown]	void MacroToEnumCallbacks::MacroDefined(const Token &MacroNameTok,
ArrayRef<Token> MacroTokens = Info->tokens();		ArrayRef<Token> MacroTokens = Info->tokens();
if (Info->isBuiltinMacro() || MacroTokens.empty())		if (Info->isBuiltinMacro() || MacroTokens.empty())
return;		return;
if (Info->isFunctionLike()) {		if (Info->isFunctionLike()) {
rememberExpressionTokens(MacroTokens);		rememberExpressionTokens(MacroTokens);
return;		return;
}		}

// Return Lit when +Lit, -Lit or ~Lit; otherwise return Unknown.		if (!isInitializer(MacroTokens))
Token Unknown;
Unknown.setKind(tok::TokenKind::unknown);
auto GetUnopArg = [Unknown](Token First, Token Second) {
return First.isOneOf(tok::TokenKind::minus, tok::TokenKind::plus,
tok::TokenKind::tilde)
? Second
: Unknown;
};

// It could just be a single token.
Token Tok = MacroTokens.front();

// It can be any arbitrary nesting of matched parentheses around
// +Lit, -Lit, ~Lit or Lit.
if (MacroTokens.size() > 2) {
// Strip off matching '(', ..., ')' token pairs.
size_t Begin = 0;
size_t End = MacroTokens.size() - 1;
assert(End >= 2U);
for (; Begin < MacroTokens.size() / 2; ++Begin, --End) {
if (!MacroTokens[Begin].is(tok::TokenKind::l_paren) ||
!MacroTokens[End].is(tok::TokenKind::r_paren))
break;
}
size_t Size = End >= Begin ? (End - Begin + 1U) : 0U;

// It was a single token inside matching parens.
if (Size == 1)
Tok = MacroTokens[Begin];
else if (Size == 2)
// It can be +Lit, -Lit or ~Lit.
Tok = GetUnopArg(MacroTokens[Begin], MacroTokens[End]);
else {
// Zero or too many tokens after we stripped matching parens.
rememberExpressionTokens(MacroTokens);
return;
}
} else if (MacroTokens.size() == 2) {
// It can be +Lit, -Lit, or ~Lit.
Tok = GetUnopArg(MacroTokens.front(), MacroTokens.back());
}

if (!Tok.isLiteral() || isStringLiteral(Tok.getKind()) ||
!isIntegralConstant(Tok)) {
if (Tok.isAnyIdentifier())
rememberExpressionName(Tok);
return;		return;
}

if (!isConsecutiveMacro(MD))		if (!isConsecutiveMacro(MD))
newEnum();		newEnum();
Enums.back().emplace_back(MacroNameTok, MD);		Enums.back().emplace_back(MacroNameTok, MD);
rememberLastMacroLocation(MD);		rememberLastMacroLocation(MD);
}		}

// Any macro that is undefined removes all adjacent macros from consideration as		// Any macro that is undefined removes all adjacent macros from consideration as
[192 lines not shown]

clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst

	.. title:: clang-tidy - modernize-macro-to-enum			.. title:: clang-tidy - modernize-macro-to-enum

	modernize-macro-to-enum			modernize-macro-to-enum
	=======================			=======================

	Replaces groups of adjacent macros with an unscoped anonymous enum.			Replaces groups of adjacent macros with an unscoped anonymous enum.
	Using an unscoped anonymous enum ensures that everywhere the macro			Using an unscoped anonymous enum ensures that everywhere the macro
	token was used previously, the enumerator name may be safely used.			token was used previously, the enumerator name may be safely used.

	This check can be used to enforce the C++ core guideline `Enum.1:			This check can be used to enforce the C++ core guideline `Enum.1:
	Prefer enumerations over macros			Prefer enumerations over macros
	<https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#enum1-prefer-enumerations-over-macros>`_,			<https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#enum1-prefer-enumerations-over-macros>`_,
	within the constraints outlined below.			within the constraints outlined below.

	Potential macros for replacement must meet the following constraints:			Potential macros for replacement must meet the following constraints:

	- Macros must expand only to integral literal tokens or simple expressions			- Macros must expand only to integral literal tokens or expressions
	of literal tokens. The unary operators plus, minus and tilde are			of literal tokens. The expression may contain any of the unary
	recognized to allow for positive, negative and bitwise negated integers.			operators ``-``, ``+``, ``~`` or ``!``, any of the binary operators
	The above expressions may also be surrounded by matching pairs of			``,``, ``-``, ``+``, ``*``, ``/``, ``%``, ``&``, ``|``, ``^``, ``<``,
	parentheses. More complicated integral constant expressions are not			``>``, ``<=``, ``>=``, ``==``, ``!=``, ``||``, ``&&``, ``<<``, ``>>``
	recognized by this check.			or ``<=>``, the ternary operator ``?:`` and its
				aaron.ballman: Maybe we should also list the binary `?:` GNU extension and comma expressions?
				LegalizeAdulthood (Author): Yes, I need to update the docs on that and also call out the potential false positives explicitly.
				`GNU extension <https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html>`_.
				Parenthesized expressions are also recognized. This recognizes
				most valid expressions. In particular, expressions with the
				``sizeof`` operator are not recognized.
	- Macros must be defined on sequential source file lines, or with			- Macros must be defined on sequential source file lines, or with
	only comment lines in between macro definitions.			only comment lines in between macro definitions.
	- Macros must all be defined in the same source file.			- Macros must all be defined in the same source file.
	- Macros must not be defined within a conditional compilation block.			- Macros must not be defined within a conditional compilation block.
	(Conditional include guards are exempt from this constraint.)			(Conditional include guards are exempt from this constraint.)
	- Macros must not be defined adjacent to other preprocessor directives.			- Macros must not be defined adjacent to other preprocessor directives.
	- Macros must not be used in any conditional preprocessing directive.			- Macros must not be used in any conditional preprocessing directive.
				- Macros must not be used as arguments to other macros.
	- Macros must not be undefined.			- Macros must not be undefined.
				- Macros must be defined at the top-level, not inside any declaration or
				definition.

	Each cluster of macros meeting the above constraints is presumed to			Each cluster of macros meeting the above constraints is presumed to
	be a set of values suitable for replacement by an anonymous enum.			be a set of values suitable for replacement by an anonymous enum.
	From there, a developer can give the anonymous enum a name and			From there, a developer can give the anonymous enum a name and
	continue refactoring to a scoped enum if desired. Comments on the			continue refactoring to a scoped enum if desired. Comments on the
	same line as a macro definition or between subsequent macro definitions			same line as a macro definition or between subsequent macro definitions
	are preserved in the output. No formatting is assumed in the provided			are preserved in the output. No formatting is assumed in the provided
	replacements, although clang-tidy can optionally format all fixes.			replacements, although clang-tidy can optionally format all fixes.

				.. warning::

				Initializing expressions are assumed to be valid initializers for
				an enum. C requires that enum values fit into an ``int``, but
				this may not be the case for some accepted constant expressions.
				For instance ``1 << 40`` will not fit into an ``int`` when the size of
				an ``int`` is 32 bits.

	Examples:			Examples:

	.. code-block:: c++			.. code-block:: c++

	#define RED 0xFF0000			#define RED 0xFF0000
	#define GREEN 0x00FF00			#define GREEN 0x00FF00
	#define BLUE 0x0000FF			#define BLUE 0x0000FF

	[23 lines not shown]

clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp

	// RUN: %check_clang_tidy -std=c++14-or-later %s modernize-macro-to-enum %t -- -- -I%S/Inputs/modernize-macro-to-enum -fno-delayed-template-parsing			// RUN: %check_clang_tidy -std=c++14-or-later %s modernize-macro-to-enum %t -- -- -I%S/Inputs/modernize-macro-to-enum -fno-delayed-template-parsing
	// C++14 or later required for binary literals.			// C++14 or later required for binary literals.

	#if 1			#if 1
	#include "modernize-macro-to-enum.h"			#include "modernize-macro-to-enum.h"

	// These macros are skipped due to being inside a conditional compilation block.			// These macros are skipped due to being inside a conditional compilation block.
	#define GOO_RED 1			#define GOO_RED 1
	#define GOO_GREEN 2			#define GOO_GREEN 2
	#define GOO_BLUE 3			#define GOO_BLUE 3

	#endif			#endif

				// Macros expanding to expressions involving only literals are converted.
				#define EXPR1 1 - 1
				#define EXPR2 1 + 1
				#define EXPR3 1 * 1
				#define EXPR4 1 / 1
				#define EXPR5 1 | 1
				#define EXPR6 1 & 1
				#define EXPR7 1 << 1
				#define EXPR8 1 >> 1
				#define EXPR9 1 % 2
				#define EXPR10 1 ^ 1
				#define EXPR11 (1 + (2))
				#define EXPR12 ((1) + (2 + 0) + (1 * 1) + (1 / 1) + (1 | 1 ) + (1 & 1) + (1 << 1) + (1 >> 1) + (1 % 2) + (1 ^ 1))
				aaron.ballman: Other interesting tests I'd expect we could convert into an enum (at least theoretically):
				#define A 12 + +1
				#define B 12 - -1
				#define C (1, 2, 3)
				#define D 100 ? : 8
				#define E 100 ? 100 : 8
				#define F 'f'
				#define G "foo"[0]
				#define H 1 && 2
				#define I 1 || 2
				LegalizeAdulthood (Author): Most of these (except comma operator and string subscript, see my comments earlier) are covered in the unit test for the matcher. I'll add tests for these:
				12 + +1
				12 - -1
				100 ? : 8
				// CHECK-MESSAGES: :[[@LINE-12]]:1: warning: replace macro with enum [modernize-macro-to-enum]
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR1' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR2' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR3' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR4' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR5' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR6' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR7' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR8' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR9' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR10' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR11' defines an integral constant; prefer an enum instead
				// CHECK-MESSAGES: :[[@LINE-13]]:9: warning: macro 'EXPR12' defines an integral constant; prefer an enum instead
				// CHECK-FIXES: enum {
				// CHECK-FIXES-NEXT: EXPR1 = 1 - 1,
				// CHECK-FIXES-NEXT: EXPR2 = 1 + 1,
				// CHECK-FIXES-NEXT: EXPR3 = 1 * 1,
				// CHECK-FIXES-NEXT: EXPR4 = 1 / 1,
				// CHECK-FIXES-NEXT: EXPR5 = 1 | 1,
				// CHECK-FIXES-NEXT: EXPR6 = 1 & 1,
				// CHECK-FIXES-NEXT: EXPR7 = 1 << 1,
				// CHECK-FIXES-NEXT: EXPR8 = 1 >> 1,
				// CHECK-FIXES-NEXT: EXPR9 = 1 % 2,
				// CHECK-FIXES-NEXT: EXPR10 = 1 ^ 1,
				// CHECK-FIXES-NEXT: EXPR11 = (1 + (2)),
				// CHECK-FIXES-NEXT: EXPR12 = ((1) + (2 + 0) + (1 * 1) + (1 / 1) + (1 | 1 ) + (1 & 1) + (1 << 1) + (1 >> 1) + (1 % 2) + (1 ^ 1))
				// CHECK-FIXES-NEXT: };

	#define RED 0xFF0000			#define RED 0xFF0000
	#define GREEN 0x00FF00			#define GREEN 0x00FF00
	#define BLUE 0x0000FF			#define BLUE 0x0000FF
	// CHECK-MESSAGES: :[[@LINE-3]]:1: warning: replace macro with enum [modernize-macro-to-enum]			// CHECK-MESSAGES: :[[@LINE-3]]:1: warning: replace macro with enum [modernize-macro-to-enum]
	// CHECK-MESSAGES: :[[@LINE-4]]:9: warning: macro 'RED' defines an integral constant; prefer an enum instead			// CHECK-MESSAGES: :[[@LINE-4]]:9: warning: macro 'RED' defines an integral constant; prefer an enum instead
	// CHECK-MESSAGES: :[[@LINE-4]]:9: warning: macro 'GREEN' defines an integral constant; prefer an enum instead			// CHECK-MESSAGES: :[[@LINE-4]]:9: warning: macro 'GREEN' defines an integral constant; prefer an enum instead
	// CHECK-MESSAGES: :[[@LINE-4]]:9: warning: macro 'BLUE' defines an integral constant; prefer an enum instead			// CHECK-MESSAGES: :[[@LINE-4]]:9: warning: macro 'BLUE' defines an integral constant; prefer an enum instead
	// CHECK-FIXES: enum {			// CHECK-FIXES: enum {
	[302 lines not shown]
	#define INSIDE8 8			#define INSIDE8 8
	};			};

	// Ignore macros defined inside a template function definition.			// Ignore macros defined inside a template function definition.
	template <int N>			template <int N>
	#define INSIDE9 9			#define INSIDE9 9
	bool fn()			bool fn()
	{			{
	#define INSIDE10 10			#define INSIDE10 10
	return INSIDE9 > 1 || INSIDE10 < N;			return INSIDE9 > 1 || INSIDE10 < N;
	}			}

	// Ignore macros defined inside a variable declaration.			// Ignore macros defined inside a variable declaration.
	extern int			extern int
	#define INSIDE11 11			#define INSIDE11 11
	v;			v;

	[52 lines not shown]

clang-tools-extra/unittests/clang-tidy/CMakeLists.txt

[19 lines not shown]	add_extra_unittest(ClangTidyTests
AddConstTest.cpp		AddConstTest.cpp
ClangTidyDiagnosticConsumerTest.cpp		ClangTidyDiagnosticConsumerTest.cpp
ClangTidyOptionsTest.cpp		ClangTidyOptionsTest.cpp
DeclRefExprUtilsTest.cpp		DeclRefExprUtilsTest.cpp
IncludeInserterTest.cpp		IncludeInserterTest.cpp
GlobListTest.cpp		GlobListTest.cpp
GoogleModuleTest.cpp		GoogleModuleTest.cpp
LLVMModuleTest.cpp		LLVMModuleTest.cpp
		ModernizeModuleTest.cpp
NamespaceAliaserTest.cpp		NamespaceAliaserTest.cpp
ObjCModuleTest.cpp		ObjCModuleTest.cpp
OptionsProviderTest.cpp		OptionsProviderTest.cpp
OverlappingReplacementsTest.cpp		OverlappingReplacementsTest.cpp
UsingInserterTest.cpp		UsingInserterTest.cpp
ReadabilityModuleTest.cpp		ReadabilityModuleTest.cpp
TransformerClangTidyCheckTest.cpp		TransformerClangTidyCheckTest.cpp
)		)
[12 lines not shown]	clang_target_link_libraries(ClangTidyTests
clangTransformer		clangTransformer
)		)
target_link_libraries(ClangTidyTests		target_link_libraries(ClangTidyTests
PRIVATE		PRIVATE
clangTidy		clangTidy
clangTidyAndroidModule		clangTidyAndroidModule
clangTidyGoogleModule		clangTidyGoogleModule
clangTidyLLVMModule		clangTidyLLVMModule
		clangTidyModernizeModule
clangTidyObjCModule		clangTidyObjCModule
clangTidyReadabilityModule		clangTidyReadabilityModule
clangTidyUtils		clangTidyUtils
LLVMTestingSupport		LLVMTestingSupport
)		)

clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

This file was added.

				//===---- ModernizeModuleTest.cpp - clang-tidy ----------------------------===//
				//
				LegalizeAdulthood: Needs a file header
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				#include "ClangTidyTest.h"
				#include "modernize/IntegralLiteralExpressionMatcher.h"
				#include "clang/Lex/Lexer.h"
				#include "gtest/gtest.h"

				#include <cstring>
				#include <iterator>
				#include <string>
				#include <vector>

				namespace clang {
				namespace tidy {
				namespace test {

				static std::vector<Token> tokenify(const char *Text) {
				LangOptions LangOpts;
				std::vector<std::string> Includes;
				LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
				Includes, LangStandard::lang_cxx20);
				Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
				std::vector<Token> Tokens;
				bool End = false;
				while (!End) {
				Token Tok;
				End = Lex.LexFromRawLexer(Tok);
				Tokens.push_back(Tok);
				}

				return Tokens;
				}

				static bool matchText(const char *Text) {
				std::vector<Token> Tokens{tokenify(Text)};
				modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);

				return Matcher.match();
				}

				namespace {

				struct Param {
				bool Matched;
				const char *Text;
				};

				class MatcherTest : public ::testing::TestWithParam<Param> {};

				} // namespace

				static const Param Params[] = {
				// Accept integral literals.
				{true, "1"},
				{true, "0177"},
				{true, "0xdeadbeef"},
				{true, "0b1011"},
				{true, "'c'"},
				// Reject non-integral literals.
				{false, "1.23"},
				{false, "0x1p3"},
				{false, R"("string")"},
				aaron.ballman: `12i`, `.0`
				LegalizeAdulthood: `.0` is already covered by the case `1.23`. I'm not home-brewing tokenization, but using the Lexer to do that. `12i` I need to investigate to find out what the Lexer does.
				LegalizeAdulthood: OK, so `12i` turns into a `numeric_constant` token, so I've added test cases to exclude those and enhanced the matcher. Essentially, it was a bug in the existing implementation that `12i` wasn't rejected outright.
				{false, "1i"},

				// Accept literals with these unary operators.
				{true, "-1"},
				{true, "+1"},
				{true, "~1"},
				{true, "!1"},
				// Reject invalid unary operators.
				{false, "1-"},
				{false, "1+"},
				{false, "1~"},
				{false, "1!"},

				// Accept valid binary operators.
				{true, "1+1"},
				{true, "1-1"},
				{true, "1*1"},
				{true, "1/1"},
				{true, "1%2"},
				{true, "1<<1"},
				{true, "1>>1"},
				{true, "1<=>1"},
				{true, "1<1"},
				{true, "1>1"},
				{true, "1<=1"},
				{true, "1>=1"},
				{true, "1==1"},
				{true, "1!=1"},
				{true, "1&1"},
				{true, "1^1"},
				{true, "1|1"},
				{true, "1&&1"},
				{true, "1||1"},
				aaron.ballman: `100 ? : 10`, `1, 2`
				{true, "1+ +1"}, // A space is needed to avoid being tokenized as ++ or --.
				{true, "1- -1"},
				{true, "1,1"},
				// Reject invalid binary operators.
				{false, "1+"},
				{false, "1-"},
				{false, "1*"},
				{false, "1/"},
				{false, "1%"},
				{false, "1<<"},
				{false, "1>>"},
				{false, "1<=>"},
				{false, "1<"},
				{false, "1>"},
				{false, "1<="},
				{false, "1>="},
				{false, "1=="},
				{false, "1!="},
				{false, "1&"},
				{false, "1^"},
				{false, "1|"},
				{false, "1&&"},
				{false, "1||"},
				{false, "1,"},
				{false, ",1"},

				// Accept valid ternary operators.
				{true, "1?1:1"},
				{true, "1?:1"}, // A gcc extension treats x ? : y as x ? x : y.
				// Reject invalid ternary operators.
				{false, "?"},
				{false, "?1"},
				{false, "?:"},
				{false, "?:1"},
				{false, "?1:"},
				aaron.ballman: This one is valid
				{false, "?1:1"},
				{false, "1?"},
				{false, "1?1"},
				{false, "1?:"},
				{false, "1?1:"},

				// Accept parenthesized expressions.
				{true, "(1)"},
				{true, "((+1))"},
				{true, "((+(1)))"},
				{true, "(-1)"},
				{true, "-(1)"},
				{true, "(+1)"},
				{true, "((+1))"},
				{true, "+(1)"},
				{true, "(~1)"},
				{true, "~(1)"},
				{true, "(!1)"},
				{true, "!(1)"},
				{true, "(1+1)"},
				{true, "(1-1)"},
				{true, "(1*1)"},
				{true, "(1/1)"},
				{true, "(1%2)"},
				{true, "(1<<1)"},
				{true, "(1>>1)"},
				{true, "(1<=>1)"},
				{true, "(1<1)"},
				{true, "(1>1)"},
				{true, "(1<=1)"},
				{true, "(1>=1)"},
				{true, "(1==1)"},
				{true, "(1!=1)"},
				{true, "(1&1)"},
				{true, "(1^1)"},
				{true, "(1|1)"},
				{true, "(1&&1)"},
				{true, "(1||1)"},
				{true, "(1?1:1)"},

				// Accept more complicated "chained" expressions.
				{true, "1+1+1"},
				{true, "1+1+1+1"},
				{true, "1+1+1+1+1"},
				{true, "1*1*1"},
				{true, "1*1*1*1"},
				{true, "1*1*1*1*1"},
				{true, "1<<1<<1"},
				{true, "4U>>1>>1"},
				{true, "1<1<1"},
				{true, "1>1>1"},
				{true, "1<=1<=1"},
				{true, "1>=1>=1"},
				{true, "1==1==1"},
				{true, "1!=1!=1"},
				{true, "1&1&1"},
				{true, "1^1^1"},
				{true, "1|1|1"},
				{true, "1&&1&&1"},
				{true, "1||1||1"},
				{true, "1,1,1"},
				};

				TEST_P(MatcherTest, MatchResult) {
				EXPECT_TRUE(matchText(GetParam().Text) == GetParam().Matched);
				}

				INSTANTIATE_TEST_SUITE_P(TokenExpressionParserTests, MatcherTest,
				::testing::ValuesIn(Params));

				} // namespace test
				} // namespace tidy
				} // namespace clang

				std::ostream &operator<<(std::ostream &Str,
				const clang::tidy::test::Param &Value) {
				return Str << "Matched: " << std::boolalpha << Value.Matched << ", Text: '"
				<< Value.Text << "'";
				}
				aaronpuchert: Seems to have caused a build failure:
				FAILED: tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
				/home/buildbots/clang.11.0.0/bin/clang++ --gcc-toolchain=/opt/rh/devtoolset-7/root/usr -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/tools/extra/unittests/clang-tidy -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include -Itools/clang/include -Iinclude -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG -Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o -MF tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d -o tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o -c /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
				/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15: error: unused function 'operator<<' [-Werror,-Wunused-function]
				std::ostream &operator<<(std::ostream &Str,
				^
				1 error generated.
				LegalizeAdulthood: Simon Pilgrim fixed it, but I don't understand why clang calls this function unused. When the test fails, gtest uses this function to pretty print the parameter. I'm rebuilding with a forced test failure to validate.
				LegalizeAdulthood: Yes, without this function the failing test prints results like this:
				[ RUN ] TokenExpressionParserTests/MatcherTest.MatchResult/123
				D:\legalize\llvm\llvm-project\clang-tools-extra\unittests\clang-tidy\ModernizeModuleTest.cpp(200): error: Value of: matchText(GetParam().Text) == GetParam().Matched
				Actual: false
				Expected: true
				[ FAILED ] TokenExpressionParserTests/MatcherTest.MatchResult/123, where GetParam() = 16-byte object <00-00 00-00 00-00 00-00 40-EC 3B-D6 F6-7F 00-00> (1 ms)
				...which isn't particularly useful. So how do we include pretty printers for tests without clang erroneously flagging them as unused?
				aaronpuchert: What got me wondering: this definition is last in the file, and there is no prior declaration of this function. How can there be any uses of it? We're not in class scope, so all prior uses of `operator<<` or perhaps `PrintTo` must have been resolved to some other function already. Or am I missing something?
				LegalizeAdulthood: Could have been an MSVC-ism, which is what I develop with. It was most definitely being used while I was working on the test cases.
				LegalizeAdulthood: It seems what other tests do is define a friend function in the parameter class. I'm going to push that and see if that is accepted.
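
For readers following the thread, here is a minimal sketch of the friend-function approach mentioned in the last comment. It is not the code that was committed. The namespace `sketch` and the helper `describe` are hypothetical names used only for illustration, and `describe` merely stands in for a framework-side printer that streams a value from inside a template. The point is that an `operator<<` defined as a friend inside the parameter struct is declared together with the type, so argument-dependent lookup can find it from such a template, whereas a free function defined at the end of the file in a different namespace may not be found by a conforming compiler.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch { // hypothetical namespace, not part of the patch

// Stand-in for the test's parameter struct.
struct Param {
  bool Matched;
  const char *Text;

  // Friend defined inside the class: it becomes visible to
  // argument-dependent lookup wherever a Param is streamed.
  friend std::ostream &operator<<(std::ostream &Str, const Param &Value) {
    return Str << "Matched: " << std::boolalpha << Value.Matched
               << ", Text: '" << Value.Text << "'";
  }
};

// Hypothetical generic printer. The `<<` below is a dependent call,
// resolved through ADL when the template is instantiated.
template <typename T> std::string describe(const T &Value) {
  std::ostringstream Out;
  Out << Value;
  return Out.str();
}

} // namespace sketch
```

With this shape, `sketch::describe(sketch::Param{true, "1+1"})` yields `Matched: true, Text: '1+1'`, the same text the original free function would have produced.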


Diff 429395
