This is an archive of the discontinued LLVM Phabricator instance.

[Diagnostics][NFC] Fix -Wlogical-op-parentheses warning inconsistency for const and constexpr values
ClosedPublic

Authored by hazohelet on Jan 2 2023, 4:40 PM.

Details

Summary

When using the && operator within a || operator, both Clang and GCC produce a warning for potentially confusing operator precedence. However, Clang avoids this warning for certain patterns, such as a && b || 0 or a || b && 1, where the operator precedence of && and || does not change the result.

However, this behavior appears inconsistent when using the const or constexpr qualifiers. For example:

bool t = true;
bool tt = true || false && t; // Warning: '&&' within '||' [-Wlogical-op-parentheses]
const bool t = true;
bool tt = true || false && t; // No warning
const bool t = false;
bool tt = true || false && t; // Warning: '&&' within '||' [-Wlogical-op-parentheses]

The second example does not produce a warning because true || false && t matches the a || b && 1 pattern, while the third one does not match any of them.

This behavior can lead to the lack of warnings for complicated constexpr expressions. Clang should only suppress this warning when literal values are placed in the place of t in the examples above.

This patch adds the literal-or-not check to fix the inconsistent warnings for && within || when using const or constexpr.

Diff Detail

Event Timeline

hazohelet created this revision.Jan 2 2023, 4:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 2 2023, 4:40 PM
hazohelet requested review of this revision.Jan 2 2023, 4:40 PM

This behavior can lead to the lack of warnings for complicated constexpr expressions.

Are those bugs, though? If the parentheses wouldn't change the outcome, due to the RHS being constant, maybe these undiagnosed cases aren't reflective of source code bugs/coding mistakes the user needs to change?

As you point out, enhancement may be more accurate than bug fix.
There are rare cases where enabling a warning for missing parentheses in constexpr logical expressions can be helpful, I think. For example, consider the following code:

constexpr A = ...;
constexpr B = ...;
constexpr C = ...;

static_assert(!(A && B || C));

In this case, the static assertion will only be successful if (A && B || C) evaluates to false, which is equivalent to the following combinations of values for A, B, and C:

(A, B, C) = (T, F, F) (F, T, F) (F, F, F)

Note that T means true and F means false. Here, C is always false, so A && B || C matches the case of a && b || 0. Thus, the warning is not issued before this patch.
If the programmer is not careful and assumes that (A && B || C) is equivalent to (A && (B || C)), then they expect the values of A, B, and C to also include the following combinations:

(A, B, C) = (F, T, T) (F, F, T)

This would require the programmer to consider additional, unnecessary combinations after the successful static assertion.

Enabling a warning for missing parentheses in this scenario could help prevent the programmer from making this mistake.

As you point out, enhancement may be more accurate than bug fix.
There are rare cases where enabling a warning for missing parentheses in constexpr logical expressions can be helpful, I think. For example, consider the following code:

constexpr A = ...;
constexpr B = ...;
constexpr C = ...;

static_assert(!(A && B || C));

In this case, the static assertion will only be successful if (A && B || C) evaluates to false, which is equivalent to the following combinations of values for A, B, and C:

(A, B, C) = (T, F, F) (F, T, F) (F, F, F)

Note that T means true and F means false. Here, C is always false, so A && B || C matches the case of a && b || 0. Thus, the warning is not issued before this patch.
If the programmer is not careful and assumes that (A && B || C) is equivalent to (A && (B || C)), then they expect the values of A, B, and C to also include the following combinations:

(A, B, C) = (F, T, T) (F, F, T)

This would require the programmer to consider additional, unnecessary combinations after the successful static assertion.

Enabling a warning for missing parentheses in this scenario could help prevent the programmer from making this mistake.

Fair enough. I guess the same argument applies to the non-static case too. I think it was original mostly motivated by people using && "This is the reason for the assert" without parentheses around the LHS & that was generally fine. So, so long as a string literal still does the suppression, it's probably fine.

hazohelet updated this revision to Diff 486776.Jan 6 2023, 1:53 AM

This update limits the warning suppression case to string literals only, and delete no longer necessary functions.

hazohelet updated this revision to Diff 486793.Jan 6 2023, 3:15 AM

add up the former 2 commits into 1

So, so long as a string literal still does the suppression, it's probably fine.

I agree with you. I now also think that integer literals _should not_ do the warning suppression because programmers are sometimes unsure of the expansion result of predefined macros in the compiled environment.
I updated the differential. I am new to Phabricator and made some unnecessary operations. Sorry for bothering you.

The risk now is that this might significantly regress/add new findings for this warning that may not be sufficiently bug-finding to be worth immediate cleanup, causing users to have to choose between extensive lower-value cleanup and disabling the warning entirely.

Have you/could you run this over a significant codebase to see what sort of new findings the modified warning finds, to see if they're high quality bug finding, or mostly noise/check for whether this starts to detect certain idioms we want to handle differently?

It might be hard to find a candidate codebase that isn't already warning-clean with GCC (at least Clang/LLVM wouldn't be a good candidate because of this) & maybe that's sufficient justification to not worry too much about this outcome...

@aaron.ballman curious what your take on this might be

I have yet to do thorough checks using this patched clang to build significant code bases.
It will likely take quite a bit of time as I am not used to editing build tool files.

Instead, I used grep to find potentially newly-warned codes.
The grep command is shown below. It is intended to match C/C++ codes with both && and || within which 0, 1, or 2 matching parentheses exist in a single line.
grep -r --include=\*.cpp --include=\*.cc --include=\*.c -e "&&[^()]*||" -e "||[^()]*&&" -e "&&[^()]*([^()]*)[^()]*||" -e "||[^()]*([^()]*)[^()]*&&" -e "&&[^()]*([^()]*([^()]*)[^()]*)[^()]*||" -e "||[^()]*([^()]*([^()]*)[^()]*)[^()]*&&"

I applied this grep command to the following popular GitHub repositories NVIDIA/TensorRT, bulletphysics/bullet3, apple/foundationdb, grpc/grpc, microsoft/lightgbm, rui314/mold, oneapi-src/oneTBB, SerenityOS/serenity, tensorflow/tensorflow.
I found the 7 examples of missing parentheses at && within || in the matched strings (many of the matchings exist in preprocessor conditionals, which is out of the scope of this patch)
They are listed at the end of this comment.
The last case in tensorflow is an assert(a || b && "reason for the assert") known idiom and is handled correctly.
Because the newly-warned constants are compile-time constants not dependent on function arguments, the other 6 cases will also get warnings from the clang before this patch.

It suggests that this patch does not generate extensive new warnings in existent codebases.

oneTBB:
https://github.com/oneapi-src/oneTBB/blob/e6e493f96ec8b7e2e2b4d048ed49356eb54ec2a0/examples/common/gui/xvideo.cpp#L359
https://github.com/oneapi-src/oneTBB/blob/e6e493f96ec8b7e2e2b4d048ed49356eb54ec2a0/src/tbbmalloc/frontend.cpp#L1266
tensorflow:
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/tf2tensorrt/convert/convert_nodes_test.cc#L4130
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/tf2tensorrt/convert/convert_nodes_test.cc#L4074
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/tf2tensorrt/convert/convert_nodes_test.cc#L9456
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/jit/deadness_analysis.cc#L1423
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/core/ir/tf_op_wrapper.cc#L25

I have yet to do thorough checks using this patched clang to build significant code bases.
It will likely take quite a bit of time as I am not used to editing build tool files.

FWIW a lot of build systems support setting CXXFLAGS/CFLAGS before invoking the build system/build generator (cmake, for instance) and respects those - so might be relatively easy to add a new warning flag to the build (& CXX/CC to set the compiler to point to your local build with the patch/changes)

Instead, I used grep to find potentially newly-warned codes.
The grep command is shown below. It is intended to match C/C++ codes with both && and || within which 0, 1, or 2 matching parentheses exist in a single line.
grep -r --include=\*.cpp --include=\*.cc --include=\*.c -e "&&[^()]*||" -e "||[^()]*&&" -e "&&[^()]*([^()]*)[^()]*||" -e "||[^()]*([^()]*)[^()]*&&" -e "&&[^()]*([^()]*([^()]*)[^()]*)[^()]*||" -e "||[^()]*([^()]*([^()]*)[^()]*)[^()]*&&"

I applied this grep command to the following popular GitHub repositories NVIDIA/TensorRT, bulletphysics/bullet3, apple/foundationdb, grpc/grpc, microsoft/lightgbm, rui314/mold, oneapi-src/oneTBB, SerenityOS/serenity, tensorflow/tensorflow.
I found the 7 examples of missing parentheses at && within || in the matched strings (many of the matchings exist in preprocessor conditionals, which is out of the scope of this patch)
They are listed at the end of this comment.
The last case in tensorflow is an assert(a || b && "reason for the assert") known idiom and is handled correctly.
Because the newly-warned constants are compile-time constants not dependent on function arguments, the other 6 cases will also get warnings from the clang before this patch.

It suggests that this patch does not generate extensive new warnings in existent codebases.

oneTBB:
https://github.com/oneapi-src/oneTBB/blob/e6e493f96ec8b7e2e2b4d048ed49356eb54ec2a0/examples/common/gui/xvideo.cpp#L359
https://github.com/oneapi-src/oneTBB/blob/e6e493f96ec8b7e2e2b4d048ed49356eb54ec2a0/src/tbbmalloc/frontend.cpp#L1266
tensorflow:
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/tf2tensorrt/convert/convert_nodes_test.cc#L4130
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/tf2tensorrt/convert/convert_nodes_test.cc#L4074
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/tf2tensorrt/convert/convert_nodes_test.cc#L9456
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/compiler/jit/deadness_analysis.cc#L1423
https://github.com/tensorflow/tensorflow/blob/fb2c3b1e3140af73f949981d8428379cbb28228b/tensorflow/core/ir/tf_op_wrapper.cc#L25

So, based on best guess at least from grepping - you don't find any new instances of the warning in these code bases?

FWIW a lot of build systems support setting CXXFLAGS/CFLAGS before invoking the build system/build generator (cmake, for instance) and respects those - so might be relatively easy to add a new warning flag to the build (& CXX/CC to set the compiler to point to your local build with the patch/changes)

Thanks for your advice! I'll give it a try. It might be a nice opportunity for me to learn some compiler development methods.

So, based on best guess at least from grepping - you don't find any new instances of the warning in these code bases?

No, I did not find any.

The risk now is that this might significantly regress/add new findings for this warning that may not be sufficiently bug-finding to be worth immediate cleanup, causing users to have to choose between extensive lower-value cleanup and disabling the warning entirely.

Have you/could you run this over a significant codebase to see what sort of new findings the modified warning finds, to see if they're high quality bug finding, or mostly noise/check for whether this starts to detect certain idioms we want to handle differently?

It might be hard to find a candidate codebase that isn't already warning-clean with GCC (at least Clang/LLVM wouldn't be a good candidate because of this) & maybe that's sufficient justification to not worry too much about this outcome...

@aaron.ballman curious what your take on this might be

Thank you for the ping (and the patience waiting on my response)!

I think there's a design here that could make sense to me.

Issuing the diagnostic when there is a literal is silly because the literal value is never going to change. However, with a constant expression, the value could change depending on configuration. This begs the question of: what do we do with literals that are expanded from a macro? It looks like we elide the diagnostic in that case, but macros also imply potential configurability. So I think the design that would make sense to me is to treat macro expansions and constant expressions the same way (diagnose) and only elide the diagnostic when there's a (possibly string) literal. WDYT?

I would also like to see the diagnostic run over some reasonably large corpus of code as it may point out cases where our reasoning is incorrect that we can address. But I don't insist on it in this case given that this is also coming more in line with GCC's behavior (I've not tested how they handle the macro-expanding-to-a-literal case though).

The risk now is that this might significantly regress/add new findings for this warning that may not be sufficiently bug-finding to be worth immediate cleanup, causing users to have to choose between extensive lower-value cleanup and disabling the warning entirely.

Have you/could you run this over a significant codebase to see what sort of new findings the modified warning finds, to see if they're high quality bug finding, or mostly noise/check for whether this starts to detect certain idioms we want to handle differently?

It might be hard to find a candidate codebase that isn't already warning-clean with GCC (at least Clang/LLVM wouldn't be a good candidate because of this) & maybe that's sufficient justification to not worry too much about this outcome...

@aaron.ballman curious what your take on this might be

Thank you for the ping (and the patience waiting on my response)!

I think there's a design here that could make sense to me.

Issuing the diagnostic when there is a literal is silly because the literal value is never going to change. However, with a constant expression, the value could change depending on configuration. This begs the question of: what do we do with literals that are expanded from a macro? It looks like we elide the diagnostic in that case, but macros also imply potential configurability. So I think the design that would make sense to me is to treat macro expansions and constant expressions the same way (diagnose) and only elide the diagnostic when there's a (possibly string) literal. WDYT?

Yeah, I'm OK with that - though I also wouldn't feel strongly about ensuring we warn on the macro case too - if the incremental improvement to do constexpr values is enough for now and a note is left to let someone know they could expand it to handle macros.

But equally it's probably not super difficult to check if the literal is from a macro source location that differs from the source location of either of the operators, I guess? (I guess that check would be needed, so it doesn't warn when the macro is literally 'x && y || true' or the like.

I would also like to see the diagnostic run over some reasonably large corpus of code as it may point out cases where our reasoning is incorrect that we can address. But I don't insist on it in this case given that this is also coming more in line with GCC's behavior (I've not tested how they handle the macro-expanding-to-a-literal case though).

Yeah, ideally.

The risk now is that this might significantly regress/add new findings for this warning that may not be sufficiently bug-finding to be worth immediate cleanup, causing users to have to choose between extensive lower-value cleanup and disabling the warning entirely.

Have you/could you run this over a significant codebase to see what sort of new findings the modified warning finds, to see if they're high quality bug finding, or mostly noise/check for whether this starts to detect certain idioms we want to handle differently?

It might be hard to find a candidate codebase that isn't already warning-clean with GCC (at least Clang/LLVM wouldn't be a good candidate because of this) & maybe that's sufficient justification to not worry too much about this outcome...

@aaron.ballman curious what your take on this might be

Thank you for the ping (and the patience waiting on my response)!

I think there's a design here that could make sense to me.

Issuing the diagnostic when there is a literal is silly because the literal value is never going to change. However, with a constant expression, the value could change depending on configuration. This begs the question of: what do we do with literals that are expanded from a macro? It looks like we elide the diagnostic in that case, but macros also imply potential configurability. So I think the design that would make sense to me is to treat macro expansions and constant expressions the same way (diagnose) and only elide the diagnostic when there's a (possibly string) literal. WDYT?

Yeah, I'm OK with that - though I also wouldn't feel strongly about ensuring we warn on the macro case too - if the incremental improvement to do constexpr values is enough for now and a note is left to let someone know they could expand it to handle macros.

But equally it's probably not super difficult to check if the literal is from a macro source location that differs from the source location of either of the operators, I guess? (I guess that check would be needed, so it doesn't warn when the macro is literally 'x && y || true' or the like.

I mostly don't want to insist on dealing with macros in this patch, but it does leave the diagnostic behavior somewhat inconsistent to my mind. I think I can live without the macro functionality though, as this is still forward progress. And yes, you'd need to check the macro location against the operator location, I believe. Testing for a macro expansion is done with SourceLocation::isMacroID(), in case @hazohelet wants to try to implement that functionality as well.

The risk now is that this might significantly regress/add new findings for this warning that may not be sufficiently bug-finding to be worth immediate cleanup, causing users to have to choose between extensive lower-value cleanup and disabling the warning entirely.

Have you/could you run this over a significant codebase to see what sort of new findings the modified warning finds, to see if they're high quality bug finding, or mostly noise/check for whether this starts to detect certain idioms we want to handle differently?

It might be hard to find a candidate codebase that isn't already warning-clean with GCC (at least Clang/LLVM wouldn't be a good candidate because of this) & maybe that's sufficient justification to not worry too much about this outcome...

@aaron.ballman curious what your take on this might be

Thank you for the ping (and the patience waiting on my response)!

I think there's a design here that could make sense to me.

Issuing the diagnostic when there is a literal is silly because the literal value is never going to change. However, with a constant expression, the value could change depending on configuration. This begs the question of: what do we do with literals that are expanded from a macro? It looks like we elide the diagnostic in that case, but macros also imply potential configurability. So I think the design that would make sense to me is to treat macro expansions and constant expressions the same way (diagnose) and only elide the diagnostic when there's a (possibly string) literal. WDYT?

Yeah, I'm OK with that - though I also wouldn't feel strongly about ensuring we warn on the macro case too - if the incremental improvement to do constexpr values is enough for now and a note is left to let someone know they could expand it to handle macros.

But equally it's probably not super difficult to check if the literal is from a macro source location that differs from the source location of either of the operators, I guess? (I guess that check would be needed, so it doesn't warn when the macro is literally 'x && y || true' or the like.

I mostly don't want to insist on dealing with macros in this patch, but it does leave the diagnostic behavior somewhat inconsistent to my mind. I think I can live without the macro functionality though, as this is still forward progress. And yes, you'd need to check the macro location against the operator location, I believe. Testing for a macro expansion is done with SourceLocation::isMacroID(), in case @hazohelet wants to try to implement that functionality as well.

I ran the diagnostic over microsoft/lightgbm, oneapi-src/oneTBB, and rui314/mold builds. As a result, I found no new warnings from this patch.

To my surprise, both unpatched/patched clang does not issue the -Wlogical-op-parentheses warning for the following code I mentioned in the previous comment.

https://github.com/oneapi-src/oneTBB/blob/e6e493f96ec8b7e2e2b4d048ed49356eb54ec2a0/src/tbbmalloc/frontend.cpp#L1266

It is because clang does not issue warnings on x || y && z and x && y || z in the result of macro expansions as of now.
I found an issue on GitHub:
https://github.com/llvm/llvm-project/issues/19345

And yes, you'd need to check the macro location against the operator location, I believe. Testing for a macro expansion is done with SourceLocation::isMacroID(), in case @hazohelet wants to try to implement that functionality as well.

Thanks for your help. I think testing macro location against the operator is already handled in DiagnoseBinOpPrecedence, and is somewhat relevant to the issue above.

Anyway, I confirm no new instances of parentheses warning in the three repositories above.

dblaikie accepted this revision.Jan 12 2023, 9:48 AM

The risk now is that this might significantly regress/add new findings for this warning that may not be sufficiently bug-finding to be worth immediate cleanup, causing users to have to choose between extensive lower-value cleanup and disabling the warning entirely.

Have you/could you run this over a significant codebase to see what sort of new findings the modified warning finds, to see if they're high quality bug finding, or mostly noise/check for whether this starts to detect certain idioms we want to handle differently?

It might be hard to find a candidate codebase that isn't already warning-clean with GCC (at least Clang/LLVM wouldn't be a good candidate because of this) & maybe that's sufficient justification to not worry too much about this outcome...

@aaron.ballman curious what your take on this might be

Thank you for the ping (and the patience waiting on my response)!

I think there's a design here that could make sense to me.

Issuing the diagnostic when there is a literal is silly because the literal value is never going to change. However, with a constant expression, the value could change depending on configuration. This begs the question of: what do we do with literals that are expanded from a macro? It looks like we elide the diagnostic in that case, but macros also imply potential configurability. So I think the design that would make sense to me is to treat macro expansions and constant expressions the same way (diagnose) and only elide the diagnostic when there's a (possibly string) literal. WDYT?

Yeah, I'm OK with that - though I also wouldn't feel strongly about ensuring we warn on the macro case too - if the incremental improvement to do constexpr values is enough for now and a note is left to let someone know they could expand it to handle macros.

But equally it's probably not super difficult to check if the literal is from a macro source location that differs from the source location of either of the operators, I guess? (I guess that check would be needed, so it doesn't warn when the macro is literally 'x && y || true' or the like.

I mostly don't want to insist on dealing with macros in this patch, but it does leave the diagnostic behavior somewhat inconsistent to my mind. I think I can live without the macro functionality though, as this is still forward progress. And yes, you'd need to check the macro location against the operator location, I believe. Testing for a macro expansion is done with SourceLocation::isMacroID(), in case @hazohelet wants to try to implement that functionality as well.

I ran the diagnostic over microsoft/lightgbm, oneapi-src/oneTBB, and rui314/mold builds. As a result, I found no new warnings from this patch.

To my surprise, both unpatched/patched clang does not issue the -Wlogical-op-parentheses warning for the following code I mentioned in the previous comment.

https://github.com/oneapi-src/oneTBB/blob/e6e493f96ec8b7e2e2b4d048ed49356eb54ec2a0/src/tbbmalloc/frontend.cpp#L1266

It is because clang does not issue warnings on x || y && z and x && y || z in the result of macro expansions as of now.
I found an issue on GitHub:
https://github.com/llvm/llvm-project/issues/19345

Ah, thanks for doing the archaeology - and looks like there's an abandoned patch there that might be relevant to addressing some of those macro issues.

And yes, you'd need to check the macro location against the operator location, I believe. Testing for a macro expansion is done with SourceLocation::isMacroID(), in case @hazohelet wants to try to implement that functionality as well.

Thanks for your help. I think testing macro location against the operator is already handled in DiagnoseBinOpPrecedence, and is somewhat relevant to the issue above.

Anyway, I confirm no new instances of parentheses warning in the three repositories above.

Awesome :)

Seems reasonable to me, then - how about you, @aaron.ballman ?

This revision is now accepted and ready to land.Jan 12 2023, 9:48 AM
aaron.ballman accepted this revision.Jan 12 2023, 10:09 AM

Code LGTM but please add a release note for the change so folks are aware. Also, do you need someone to commit on your behalf? If so, what name and email address would you like used for patch attribution?

I added the release note.

Also, do you need someone to commit on your behalf? If so, what name and email address would you like used for patch attribution?

I don't have commit access. Would you please land this patch for me? Please use "Takuya Shimizu <shimizu2486@gmail.com>" for the patch attribution.
Thank you.

aaron.ballman closed this revision.Jan 13 2023, 5:23 AM

I added the release note.

Also, do you need someone to commit on your behalf? If so, what name and email address would you like used for patch attribution?

I don't have commit access. Would you please land this patch for me? Please use "Takuya Shimizu <shimizu2486@gmail.com>" for the patch attribution.
Thank you.

Thanks! I've landed on your behalf in 84e3fdc019412adc09b18a49063e006bd6e7efaa. (Closing manually as I forgot to add the Differential Revision tag to the commit message.)

clang/test/Sema/logical-op-parentheses.c