This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add readability-operators-representation check

Authored by PiotrZSL on Feb 21 2023, 2:46 PM.



Check helps enforce consistent token representation for binary, unary and
overloaded operators in C++ code. The check supports both traditional and
alternative representations of operators.

Diff Detail

Event Timeline

PiotrZSL created this revision.Feb 21 2023, 2:46 PM
Herald added a project: Restricted Project. · View Herald Transcript

TODO: Tests still need to be extended to cover conversion of all operators in both direction.

PiotrZSL updated this revision to Diff 499306.Feb 21 2023, 2:53 PM

list.rst correction

Actually is exactly about this :-)


cstring, please.


Please don't use auto unless type is spelled explicidly in same statement or iterator.




Excessive newline.

PiotrZSL marked 4 inline comments as done.Feb 22 2023, 12:13 PM
PiotrZSL updated this revision to Diff 499622.Feb 22 2023, 12:25 PM

Update tests + change config

This check replaces D31308 and D107294.

PiotrZSL updated this revision to Diff 499626.Feb 22 2023, 12:42 PM

Style fixes

PiotrZSL published this revision for review.Feb 22 2023, 2:55 PM

Ready for review

Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2023, 2:55 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
PiotrZSL updated this revision to Diff 503364.Mar 8 2023, 7:31 AM

Ping, Rebase

carlosgalvezp added inline comments.Mar 18 2023, 10:02 AM

Is there any reason why this distinction is needed? The usage of the operator happens in client code - it's unlikely one would want different style depending on whether the operator is overloaded or not?

bool x = some_bool and some_other_bool;
bool y = some_object && some_other_object;

Personally I find this to be a matter of style, and arguments could be found for either style. As such I don't think this check should promote the style preference of the author of the check.

It's like having a written discussion in the clang-format option ColumnLimit explaining why N characters is a better choice than M characters. In the end this is a decision each project should take, and the check should not have a bias towards a preferred choice.

carlosgalvezp added inline comments.Mar 18 2023, 10:41 AM

FYI @njames93 , do you have an opinion on the matter as Code Owner?

PiotrZSL added inline comments.Mar 18 2023, 10:46 AM

Consider code like this:

template<typename T>
void archive(T&& a, std::optional<int>& v)
    a & v;


bool test(unsigned value, unsigned mask)
    return value & mask;

Those are 2 different use-cases, probably you don't want to use bit_and, but you could use bit_and for value & mask.


Following this way of thinking none of value is correct. Always will be group that is for it or against it. Initially this check had to be only to enforce alternative tokens but then made it configurable and change name of check. This check is for "readability" and alternative tokens are more readable. If someone don't like them, their choose, they can change config to their needs.

carlosgalvezp added inline comments.Mar 19 2023, 12:30 AM

Good point, thanks for the clarification!


alternative tokens are more readable

I don't believe this is an objective, universal truth - I think it's very subjective, and you will find people who disagrees.

Take for example the "readability-identifier-naming" check. There's many different conventions supported there, there's no "correct" one and "incorrect" one. Imagine if the author wrote a 100+ lines dissertation about why Hungarian notation is better than the others, and encourage people to choose that.

if someone don't like them, their choose, they can change config to their needs.

Yes, this is exactly what I would expect from this check. My point is that this lengthy discussion here encouraging choosing one style over the other is out of place, cluttering the documentation with content that is not relevant to understand what the check does and how to use it. The check documentation should be neutral and just present the available options. This is consistent with all other checks.

The purpose of this check is not to transform code into ATR style, but rather to ensure consistency in a project, whichever style is wanted.


It's unclear what the default value is for the other operators. From the code I see that ATR is the default - in that case I think the documentation should either a) provide the full list of defaults here, or b) simply mention "ATR style is the default". As a user I would prefer the former, so I know what keywords I am allowed to type in here, without having to go and consult cppreference.

Actually, now that I think about it, wouldn't it be better to implement this as an enum, similar to the readability-identifier-naming checks? Is there a use case for having and;||;not, i.e. a mix of style within the same category (e.g. BinaryOperators)?

In that case, it could look something like:

readability-operators-representation.BinaryOperatorsStyle = Traditional
readability-operators-representation.BinaryOperatorsStyle = Alternative

This would be easier to use from a user perspective, and IMHO would lead to a more consistent codebase, where one style or the other is chosen for all operators in each category.

PiotrZSL planned changes to this revision.Mar 19 2023, 3:08 AM

Update documentation, and config examples, add "Traditional Tokens Representation" section.


I choose this style just because with single option you can define whatever you like.
You can easily be less or more strict, enforce for example alternative tokens for some, and enforce traditional for other.

Entire section "Alternative Token Representation" is here so you wouldn't need to go to cppreference, there is an mapping.

And note that default config don't enforce alternative tokens for all, just for 3 most common.
Not everyone know about alternative token existence. And cppreference don't describe why they exist. This is why this description with examples were added, about pros and some cons. Not only engineers read checks documentations , quality/product managers do that also.
And most of checks descriptions basically don't provide any information that would answer a question: "Why I should enable this check, what I will gain".

I agree, that some additional paragraph about example configurations (less, more strict) and maybe some clarification about options could be added. And some clarification why options are split into two with example, I will think about this.

As for "Alternative Token Representation" if you wan't I can extract from it also "Traditional Token Representation" and put there pros and cons. And move mapping table in front of both of them.

PiotrZSL updated this revision to Diff 506398.Mar 19 2023, 8:30 AM

Update documentation, changed default check configuration.

PiotrZSL marked 2 inline comments as done.Mar 19 2023, 8:31 AM

Looks really good, thank you! I have only very minor comments, mostly style nits and suggestions for improved readability.


As per LLVM guidelines this should be static, outside the anonymous namespace.


Nit: would be good to add a one-line comment before each of the code blocks to describe at high-level what they do, to get a better birds-eye view of the code structure in this large function. At first sight it took me a while to spot the difference and understand why it's not exactly duplicated code.

Alternatively creating a function for each of the 3 addMatcher lines could be self-documenting and make the "registerMatches" smaller.


Nit: could get a bit more descriptive name, like "binary_op"


Is this condition intended to be negated too? Also, would it make sense to move this early return to the beginning of the function?


Nit: would be good to spell this out for better readability.


I believe this include is not in use in this header?


Really appreciate the changes to the documentation, as a user it's much more clear how the check behaves and how to configure it, in a self-contained way!


Nit: keep indentation to 2 spaces as is standard in the rest of the repo (including tests).


Nit: indent comments 2 spaces so they align with the code

PiotrZSL marked 8 inline comments as done.Mar 25 2023, 5:38 AM
PiotrZSL added inline comments.

Good catch.

PiotrZSL updated this revision to Diff 508303.Mar 25 2023, 5:38 AM
PiotrZSL marked an inline comment as done.

Rebase + Fix comments

Eugene.Zelenko added inline comments.Mar 25 2023, 6:59 AM

Please omit Check.



PiotrZSL marked 2 inline comments as done.Mar 25 2023, 8:18 AM
PiotrZSL updated this revision to Diff 508310.Mar 25 2023, 8:21 AM

Fixes in documentation & clang-format

LegalizeAdulthood resigned from this revision.Mar 29 2023, 8:04 AM
carlosgalvezp accepted this revision.Mar 31 2023, 6:09 AM

LGTM, thanks for the check! Please fix the missing space comment before landing.


Missing space

This revision is now accepted and ready to land.Mar 31 2023, 6:09 AM

Thank you, by any chance would you be able to look into other reviews ?

PiotrZSL marked an inline comment as done.Mar 31 2023, 8:29 AM
cjdb added a subscriber: cjdb.Mar 31 2023, 9:48 AM

Thanks so much for seeing this through; I'm unusually looking forward to rebuilding LLVM this weekend!


This is a great solution for ranges pipelines, which I've struggled to work out a good automated policy on for years.

Thank you, by any chance would you be able to look into other reviews ?

Doing my best to catch up, I have a long list of patches to review! Unfortunately very limited availability at the moment, mostly only weekends. I tend to prioritize smaller patches that fix things since they are easier/faster to review, but I'll definitely come back to your other larger patches when I find some time :)