This is an archive of the discontinued LLVM Phabricator instance.

[Diagnostics] Diagnose misused xor as pow
ClosedPublic

Authored by xbolva00 on Jun 17 2019, 5:39 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

xbolva00 created this revision.Jun 17 2019, 5:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2019, 5:39 AM
xbolva00 marked an inline comment as done.Jun 17 2019, 5:40 AM
xbolva00 added inline comments.
include/clang/Basic/DiagnosticGroups.td
508 ↗(On Diff #205049)

GCC folks will do this diagnostic soon too, so I will wait and use their name.

xbolva00 updated this revision to Diff 205058.Jun 17 2019, 6:11 AM

More tests

Should we also emit a note how to silence it, eg. swap xor operands or add parentheses around 2/10 ?

xbolva00 updated this revision to Diff 205089.Jun 17 2019, 8:43 AM

Silence note added.

I think you want to avoid warning inside macros as well, say:

#define AWESOME(x, y) ({ blah blah blah; x ^ y; blah })

AWESOME(2, 10); // probably not a bug
test/SemaCXX/warn-xor-as-pow.cpp
13 ↗(On Diff #205089)

2 ^ 0 seems like it would be a bug too?

jfb added a comment.Jun 17 2019, 9:35 AM

What does the suggested for for 2 ^ 32 look like? I hope it's not 1 << 32 :)

xbolva00 updated this revision to Diff 205121.Jun 17 2019, 10:48 AM

Fixed notes by @jfb. Thank you.

xbolva00 marked an inline comment as done.Jun 17 2019, 10:48 AM
jfb added a comment.Jun 17 2019, 10:57 AM

Why have .c and .cpp tests?

test/Sema/warn-xor-as-pow.c
43 ↗(On Diff #205121)

I think we want to suggest 1LL << 32 or something like that. Whatever we usually do with this type of suggestion.

45 ↗(On Diff #205121)

This one hits a ceiling, we can't really suggest anything for this value IMO. Maybe we need to see if the user is doing 2^64 - 1? In that case we can offer a suggestion.

xbolva00 updated this revision to Diff 205126.Jun 17 2019, 10:58 AM
xbolva00 marked an inline comment as done.Jun 17 2019, 11:04 AM
xbolva00 added inline comments.
test/Sema/warn-xor-as-pow.c
45 ↗(On Diff #205121)

Since this requires changes in other parts of code to catch 2^64 - 1, maybe we can leave it as a follow-up patch?

jfb added inline comments.Jun 17 2019, 11:06 AM
test/Sema/warn-xor-as-pow.c
45 ↗(On Diff #205121)

Sure.

Seems low-value at first glance, but I guess I can't argue with those Twitter and codesearch results.

Do you have codesearch results for ^2? https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=%5E+2&search=Search Seems like a lot of false positives (and a lot of code comments turning up in codesearch??), but would be perhaps even more valuable to the kinds of users who would write 10^x or x^2.

What is going on in this case, and would a warning here be a false positive or a true positive? https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=10+%5E+7&search=Search

lib/Sema/SemaExpr.cpp
10913 ↗(On Diff #205121)

Why do you special-case 0b prefix (or suffix — 0x0b is also special-cased by this snippet), but you don't special-case octal or hexadecimal constants? I'm sure x ^ 0x1234 will be a more common spelling than x ^ 0b1001000110100.

10924 ↗(On Diff #205121)

Do you have metrics indicating that this line is an improvement over if (true) {?

10959 ↗(On Diff #205121)

I don't understand why parenthesizing one argument should silence the warning. Wouldn't it be more natural to suggest converting both arguments to hexadecimal or binary? That is, convert 10 ^ x to 0xA ^ x, or 2 ^ 16 to 0x2 ^ 0x10.

test/Sema/warn-xor-as-pow.c
38 ↗(On Diff #205121)

Please add test cases for 2 ^ 0x4 and 2 ^ 04 and 0x2 ^ 10 and 02 ^ 10.

39 ↗(On Diff #205121)

I don't understand why this line doesn't warn. Is it because the macro's name has the case-insensitive string xor in it? Is it because there is a macro involved at all? In either case, please add another test case for res = FOOBAR(2, 16).

Also res = EPSILON where #define EPSILON 10^-300. That seems to come up in the codesearch results a lot.

xbolva00 marked 2 inline comments as done.Jun 17 2019, 11:37 AM
xbolva00 added inline comments.
lib/Sema/SemaExpr.cpp
10924 ↗(On Diff #205121)

This is left over of older code :) I will fix it.

test/Sema/warn-xor-as-pow.c
39 ↗(On Diff #205121)

EPSILON 10^-300

this is in macro :(

so maybe if negative RHS, we should check macro too? @jfb

xbolva00 marked 3 inline comments as done.Jun 17 2019, 11:42 AM
xbolva00 updated this revision to Diff 205140.Jun 17 2019, 11:48 AM

Removed useless else block with return.

xbolva00 marked an inline comment as done.Jun 17 2019, 11:58 AM
xbolva00 added inline comments.
test/SemaCXX/warn-xor-as-pow.cpp
85 ↗(On Diff #205140)

What should we suggest here ? :) Nevative shift ? think we should diagnose it.

I am not sure about 10 ^ 0 case... warn or not ?

@jfb

Just wanted to say that I think I agree with the design choices here!

xbolva00 updated this revision to Diff 205187.Jun 17 2019, 2:19 PM

Handle 10 ^ -C case..

lib/Sema/SemaExpr.cpp
10931 ↗(On Diff #205187)

Can you use starts_with (or the LLVM equivalent) in both of these cases? It'll be faster and also more correct.

Hex and binary are handled up here on line 10927, but octal is handled down on line 10955; why? Can't they be combined into one place in the code?

10982 ↗(On Diff #205187)

Suppressing the warning specifically on 10 ^ 0 (which means 10) seems like an anti-feature.

10988 ↗(On Diff #205187)

I suggest at least one unit test case that involves 10 ^ 100, and considering the user-friendliness of the resulting error message. How about suggesting "1e" + std::to_string(RightSideIntValue) as the fixit in both cases?

test/SemaCXX/warn-xor-as-pow.cpp
52 ↗(On Diff #205187)

I still expect to see a warning either on this line, or on the line where the macro is defined. I don't see why we'd want to be silent in this case.

Thanks, I will address your notes soon!

The only remaining question is, as your said, whether or not to diagnose xors in macros. @regerh @jfb ?

xbolva00 marked an inline comment as done.Jun 17 2019, 3:20 PM
xbolva00 added inline comments.
lib/Sema/SemaExpr.cpp
10931 ↗(On Diff #205187)

We cannot use starts_with here, case: 2 ^ 0b11.

Yes, I can combine it to one place.

The only remaining question is, as your said, whether or not to diagnose xors in macros. @regerh @jfb ?

IMO it's better to not warn about xors in macros-- I like the current tradeoffs.

xbolva00 updated this revision to Diff 205198.Jun 17 2019, 3:35 PM

Addressed review notes.

The only remaining question is, as your said, whether or not to diagnose xors in macros. @regerh @jfb ?

IMO it's better to not warn about xors in macros-- I like the current tradeoffs.

Thank you for opinion.

I think this patch is (almost) done. :)

xbolva00 updated this revision to Diff 205202.Jun 17 2019, 3:39 PM

Removed unused diagnostic message.

xbolva00 marked an inline comment as done.Jun 17 2019, 3:44 PM
xbolva00 added inline comments.
test/SemaCXX/warn-xor-as-pow.cpp
13 ↗(On Diff #205089)

Small question..

Instead of 1<<0 for this case, I emit fixit "1", seems ok for you?

aaron.ballman added inline comments.Jun 19 2019, 9:05 AM
include/clang/Basic/DiagnosticSemaKinds.td
3306 ↗(On Diff #205202)

We usually use "did you mean" in these kinds of diagnostics. Replace the comma with a semicolon.

Same comments below.

lib/Sema/SemaExpr.cpp
10895 ↗(On Diff #205202)

Missing a full stop at the end of the comment. Same comment applies elsewhere.

The only remaining question is, as your said, whether or not to diagnose xors in macros. @regerh @jfb ?

IMO it's better to not warn about xors in macros-- I like the current tradeoffs.

Do you mind explaining why? I would have expected to diagnose an xor within a macro because the code is just as wrong there, but maybe I'm missing something.

We will be noisy in this case

#define IRQ_CHINT4_LEVEL        (11 ^ 7)
#define IRQ_CHINT3_LEVEL        (10 ^ 7)
#define IRQ_CHINT2_LEVEL        (9 ^ 7)
#define IRQ_CHINT1_LEVEL        (8 ^ 7)

We will be noisy in this case

#define IRQ_CHINT4_LEVEL        (11 ^ 7)
#define IRQ_CHINT3_LEVEL        (10 ^ 7)
#define IRQ_CHINT2_LEVEL        (9 ^ 7)
#define IRQ_CHINT1_LEVEL        (8 ^ 7)

Sure, but do we not want to be noisy in a case like: #define BAD_IDEA (2 ^ 16)?

... But I go thru codesearch and this is just one case I think.

I think we should warn in macros... In codesearch case ,they are many more true positives, then false negatives.

#define MAX_T_U32 ((2^32)-1)

Ouch.

... But I go thru codesearch and this is just one case I think.

I think we should warn in macros... In codesearch case ,they are many more true positives, then false negatives.

#define MAX_T_U32 ((2^32)-1)

Ouch.

That's my rationale as well. If it happens outside of macros and is bad, it's probably just as likely to happen within a macro and be equally as bad. I think we probably should warn within macros.

xbolva00 updated this revision to Diff 205636.Jun 19 2019, 10:27 AM

Added support to warn in macros.
Renamed to -Wxor-used-as-pow.

jfb added a comment.Jun 19 2019, 10:29 AM

What I meant with macros was that I don't think we should warn on:

#define LEGIT(a, b) ({ work work work; a ^ b; work work work; })

LEGIT(10, 5);

If the constants are inline in the macros then sure, warn there. Basically: if you literally wrote CONSTANT ^ CONSTANT and it looks fishy, let's warn. If token-pasting wrote it for you and it looks fishy, let's not warn.

In D63423#1550697, @jfb wrote:

What I meant with macros was that I don't think we should warn on:

#define LEGIT(a, b) ({ work work work; a ^ b; work work work; })

LEGIT(10, 5);

If the constants are inline in the macros then sure, warn there. Basically: if you literally wrote CONSTANT ^ CONSTANT and it looks fishy, let's warn. If token-pasting wrote it for you and it looks fishy, let's not warn.

So you would not want to see this code diagnosed?

#define POW(x, y) (x ^ y)

void f() {
  int two_to_the_derp = POW(2, 16);
}

I'm not certain I understand the rationale for why we would not want to diagnose such a construct. Do we have reason to expect a lot of false positives?

xbolva00 added a comment.EditedJun 19 2019, 10:41 AM
In D63423#1550697, @jfb wrote:

What I meant with macros was that I don't think we should warn on:

#define LEGIT(a, b) ({ work work work; a ^ b; work work work; })

LEGIT(10, 5);

If the constants are inline in the macros then sure, warn there. Basically: if you literally wrote CONSTANT ^ CONSTANT and it looks fishy, let's warn. If token-pasting wrote it for you and it looks fishy, let's not warn.

Is it worth to pattern match this case? I don't like how this diagnostic is getting more and more complicated.

User can use -Wno-xor-used-as-pow if auto-generated weird code.

Or I can move it under -Wextra..

jfb added a comment.Jun 19 2019, 10:43 AM
In D63423#1550697, @jfb wrote:

What I meant with macros was that I don't think we should warn on:

#define LEGIT(a, b) ({ work work work; a ^ b; work work work; })

LEGIT(10, 5);

If the constants are inline in the macros then sure, warn there. Basically: if you literally wrote CONSTANT ^ CONSTANT and it looks fishy, let's warn. If token-pasting wrote it for you and it looks fishy, let's not warn.

So you would not want to see this code diagnosed?

#define POW(x, y) (x ^ y)

void f() {
  int two_to_the_derp = POW(2, 16);
}

I'm not certain I understand the rationale for why we would not want to diagnose such a construct. Do we have reason to expect a lot of false positives?

If you wrote CONSTANT_2_OR_10 ^ CONSTANT I'm pretty sure you messed up. Macros are opaque, and I'm not sure you clearly messed up anymore. Let's take the clear win, see what falls out, and only expand our reach if we have reason to do so. Without trying on huge codebases I don't know whether we'll get a lot of false positive. I'd rather be conservative and not cause headache.

I am also in the "be a bit conservative at first and see how things shake out" camp, but it's y'all doing the hard work here, not me :)

xbolva00 updated this revision to Diff 205641.Jun 19 2019, 10:50 AM

Reverted macro support.

Okey, it should be fine now.

If you are agree with the current state, please approve it.

I've always been frustrated at how clang just gives up as soon as it sees a macro.
At best that should be controlled with some command-line flag.

jfb added a comment.Jun 19 2019, 10:54 AM

I've always been frustrated at how clang just gives up as soon as it sees a macro.
At best that should be controlled with some command-line flag.

This patch is not the place to change common clang behavior. I'm advocating for what we usually do: be conservative in macros because token-pasting tends to look "wrong" but not actually be wrong as much (i.e. false positives). If you want clang to do something different, an RFC to cfe-dev is the right place for such a discussion.

xbolva00 added a comment.EditedJun 19 2019, 10:58 AM

I've always been frustrated at how clang just gives up as soon as it sees a macro.
At best that should be controlled with some command-line flag.

I cannot promise anything, but maybe I could do -Wxor-used-as-pow-in-macro (off by default) so people could try it and see how noisy it would be :)

Edit: Oh, probably no, I would have to duplicate items in DiagnosticSemaKinds :(

In D63423#1550728, @jfb wrote:

I've always been frustrated at how clang just gives up as soon as it sees a macro.
At best that should be controlled with some command-line flag.

This patch is not the place to change common clang behavior. I'm advocating for what we usually do: be conservative in macros because token-pasting tends to look "wrong" but not actually be wrong as much (i.e. false positives). If you want clang to do something different, an RFC to cfe-dev is the right place for such a discussion.

I don't think i'm proposing to change the entirety of warnings to warn on macros, am i?
Likewise, is it a causality/practice to ignore macros, or a documented requirement [that would require RFC for changing]?
It's been awhile since i've read through the entirety of docs, but i don't recall seeing one.

I've always been frustrated at how clang just gives up as soon as it sees a macro.
At best that should be controlled with some command-line flag.

I cannot promise nothing, but maybe I could do -Wxor-used-as-pow-in-macro (off by default) so people could try it and see how noisy it would be :)

I'm not sure -Wflag + -Wflag-in-macros will scale :/
i was thinking more along the lines of warn-for-system-headers flag.

In D63423#1550713, @jfb wrote:
In D63423#1550697, @jfb wrote:

What I meant with macros was that I don't think we should warn on:

#define LEGIT(a, b) ({ work work work; a ^ b; work work work; })

LEGIT(10, 5);

If the constants are inline in the macros then sure, warn there. Basically: if you literally wrote CONSTANT ^ CONSTANT and it looks fishy, let's warn. If token-pasting wrote it for you and it looks fishy, let's not warn.

So you would not want to see this code diagnosed?

#define POW(x, y) (x ^ y)

void f() {
  int two_to_the_derp = POW(2, 16);
}

I'm not certain I understand the rationale for why we would not want to diagnose such a construct. Do we have reason to expect a lot of false positives?

If you wrote CONSTANT_2_OR_10 ^ CONSTANT I'm pretty sure you messed up. Macros are opaque, and I'm not sure you clearly messed up anymore. Let's take the clear win, see what falls out, and only expand our reach if we have reason to do so. Without trying on huge codebases I don't know whether we'll get a lot of false positive. I'd rather be conservative and not cause headache.

IIRC, we do not usually go out of our way to stop diagnosing code within macro expansions unless there's a reason to do so. Given that we know the code is likely wrong (because we're only checking constants whether inside or outside of a macro expansion) and that we've seen the incorrect code happen inside of macros in the wild, I suspect that being conservative is not much of a win. That said, I definitely understand the desire to not issue burdensome amounts of false positives. If we see false positives coming from macros, we should address them (perhaps even by not diagnosing within a macro). Perhaps the author can run the check over a large corpus of code to see whether fps come up in practice? (The presence of fps will suggest to not warn in macros, but the absence of fps won't tell us too much.)

xbolva00 updated this revision to Diff 205646.Jun 19 2019, 11:17 AM

Added dots in the comments.

xbolva00 added a comment.EditedJun 19 2019, 11:21 AM

Perhaps the author can run the check over a large corpus of code to see whether fps come up in practice? (The presence of fps will suggest to not warn in macros, but the absence of fps won't tell us too much.)

Sorry, I have no time nor spare computers to check big codebases. As @jfb said, let's start with a conservative approach and possibly improve it in the future.

Perhaps the author can run the check over a large corpus of code to see whether fps come up in practice? (The presence of fps will suggest to not warn in macros, but the absence of fps won't tell us too much.)

Sorry, I have no time nor spare computers to check big codebases. As @jfb said, let's start with a conservative approach and possibly improve it in the future.

I can live with that, but my concern is that we often don't do that "improve it in the future" bit and we already know about buggy cases in the wild where the diagnostic will not trigger. Given how often people use macros for simple, but wrong, constructs like my POW example, I think this reduces the utility of the check.

What are the GCC folks planning to do with macros?

David Malcon - GCC: “I think we'd want to *not* warn if either of the operands are from a macro expansion.“

But I see no reason to follow this and why we should restrict this even more..

xbolva00 updated this revision to Diff 205692.Jun 19 2019, 3:42 PM

Improved code, added const.

Any comments? If no, please approve..

jfb added inline comments.Jun 21 2019, 9:24 AM
include/clang/Basic/DiagnosticGroups.td
508 ↗(On Diff #205692)

Does this match the naming that GCC ended up with?

include/clang/Basic/DiagnosticSemaKinds.td
3309 ↗(On Diff #205692)

This is still a bad diagnostic, I'd rather see it fixed.

lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

Doesn't this match any expression that contains xor? Put another way, I don't see "xor" used anywhere else under clang, what's usually done?

10951 ↗(On Diff #205692)

This ignores 0B.

10954 ↗(On Diff #205692)

This ignores 0X.

10961 ↗(On Diff #205692)

The above seems to mishandle user-defined literals. Unless the UDL contains 0x or 0b in them. I'd like to see this tested.

lib/Sema/SemaExpr.cpp
10950 ↗(On Diff #205692)

I would rather see this whole section as a three-liner:

// Do not diagnose binary, octal, or hexadecimal literals.
if (StringRef(LHSStr).startswith("0") || StringRef(RHSStr).startswith("0"))
    return;
10963 ↗(On Diff #205692)

Here and on line 10971, I suggest whitespace around the << (in the suggested expression).

10983 ↗(On Diff #205692)

Do you currently warn on, let's say, 2uLL ^ 7? and if so, do we care that the suggested expression 0x2 ^ 7 has a different type? I imagine the answer to "do we care" is "no," but I thought I should ask.

xbolva00 marked 3 inline comments as done.Jun 21 2019, 9:31 AM
xbolva00 added inline comments.
include/clang/Basic/DiagnosticGroups.td
508 ↗(On Diff #205692)

No, they have no progress yet. GCC should not block our progress here..

include/clang/Basic/DiagnosticSemaKinds.td
3309 ↗(On Diff #205692)

Why so? Can you propose wording here?

We provide a fixit for it, e.g. 1LL<<32..

lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

Yes, but since xor is keyword in C++, I think this is safe.

xbolva00 marked an inline comment as done.Jun 21 2019, 9:32 AM
xbolva00 added inline comments.
lib/Sema/SemaExpr.cpp
10983 ↗(On Diff #205692)

We check if bidwith is same.. Bail out if not.

lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

I believe JF is worried about expressions like constexpr int flexor_exponent = 3; return 10 ^ flexor_exponent;. That expression contains the substring "xor" but does not use the xor keyword. Which reminds me, can you add some test cases showing what behavior you expect for operands that are not integer literals but still compile-time const, or constexpr, or template arguments?

xbolva00 marked an inline comment as done.Jun 21 2019, 9:45 AM
xbolva00 added inline comments.
lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

Ah, I will change it to “ xor “.

jfb added inline comments.Jun 21 2019, 9:47 AM
lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

Spaces aren't the only valid whitespace character :)

2 xor
   31
xbolva00 marked an inline comment as done.Jun 21 2019, 10:09 AM
xbolva00 added inline comments.
lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

I will rework it to chech only XOR op, not the while expr.

xbolva00 marked an inline comment as done.Jun 21 2019, 10:27 AM
xbolva00 added inline comments.
lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

Well, 10 ^ flexor ..

We cannot reach xor check, flexor is not a integer literal. So I will rework nothing here.

aaron.ballman added inline comments.Jun 21 2019, 10:30 AM
lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)
#define flexor 7

Now flexor is an integer literal. (A test case for this would be beneficial.)

jfb added inline comments.Jun 21 2019, 10:32 AM
lib/Sema/SemaExpr.cpp
10929 ↗(On Diff #205692)

Also, as I've said above:

constexpr long long operator"" _xor(unsigned long long v)
{ return v; }
auto i = 10 ^ 5_xor;
xbolva00 updated this revision to Diff 214523.Aug 10 2019, 5:46 AM

Addressed review notes.

xbolva00 marked 7 inline comments as done.Aug 10 2019, 5:47 AM

Hello,

I fixed your notes + added test cases.

Only remaining question is what to do in "2 ^ 64" (65, 66, .. etc) case. I can emit "result of 2 ^ 64" is "XYZ", but this is rather poor diagnostic. What should we suggest for "2 ^ 64"? (@jfb

The "2 ^ 64 - 1" case to be solved as follow up patch..

And since there is no progress on GCC side, I will use -Wxor-used-as-pow

xbolva00 updated this revision to Diff 214524.Aug 10 2019, 5:53 AM

svn added forgotten test file..

xbolva00 updated this revision to Diff 214525.Aug 10 2019, 5:56 AM
xbolva00 updated this revision to Diff 214533.Aug 10 2019, 8:22 AM

Added UDL tests

aaron.ballman added inline comments.Aug 10 2019, 9:11 AM
include/clang/Basic/DiagnosticGroups.td
508 ↗(On Diff #205692)

Agreed, but we should still coordinate to ensure the names we pick are the same since this is a user-facing option. If GCC isn't adding this yet, can you file a bug report (or comment on an existing one) with the name we've chosen?

lib/Sema/SemaExpr.cpp
10977–10980 ↗(On Diff #214533)

if (const auto *UO = dyn_cast<UnaryOperator>(RHS.get())) { ... } rather than doing an isa<> followed by a cast<>.

10997 ↗(On Diff #214533)

Why is this C++ only? C has an xor macro that comes from iso646.h. Should add a test case for this situation.

xbolva00 updated this revision to Diff 214534.Aug 10 2019, 9:23 AM

Addressed review notes

xbolva00 marked 2 inline comments as done.Aug 10 2019, 9:24 AM
xbolva00 marked an inline comment as done.
xbolva00 added inline comments.
include/clang/Basic/DiagnosticGroups.td
508 ↗(On Diff #205692)

I will inform them after patch approval.

Any futher comments or is it OK now @jfb ? @aaron.ballman ?

aaron.ballman accepted this revision.Aug 16 2019, 6:39 AM

This LGTM aside from a few nits, but please give a chance for other reviewers to weigh in on their comments.

lib/Sema/SemaExpr.cpp
11025–11031 ↗(On Diff #214534)

Might as well combine these.

test/SemaCXX/warn-xor-as-pow.cpp
100–102 ↗(On Diff #214534)

I'd also like to see a case like res = 10_xor ^ 5;

This revision is now accepted and ready to land.Aug 16 2019, 6:39 AM
xbolva00 updated this revision to Diff 215592.Aug 16 2019, 7:04 AM

Fixed nits.

xbolva00 marked 2 inline comments as done.Aug 16 2019, 7:05 AM
xbolva00 updated this revision to Diff 215693.Aug 16 2019, 3:30 PM

Better comparison for "xor".

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2019, 12:13 PM
hvdijk added a subscriber: hvdijk.EditedMay 2 2021, 9:26 AM

It's bad enough that this warns for

#define A 2
int f() { return A ^ 1; }

where as far as the users of A are concerned, A is just some arbitrary value, it's just random chance it happens to be two and there is no chance of it being misinterpreted as exponentiation, but it's much worse that this cannot be worked around by changing the definition of A to 0x2, the only way to suppress the warning with a hexadecimal constant is by explicitly using that constant in the use, i.e. by changing the function to int f() { return 0x2 ^ 1; }, which is very much inappropriate when the users of A are not supposed to care about which specific value it has.

I'd like to see this changed so that at least the #define A 0x2 case no longer triggers the warning, but ideally I'd like to see no warning for this example in its current state either. I see in the test case that that warning is intentional, but I'm not seeing from the previous discussion whether this case was based on real world programmer errors or not. Do you recall? If it was, then fair enough, I'll propose to leave that as it is.

(Edit: D97445 changed it so that the concrete example I used before no longer warns; I've edited my comment to use a different example.)

It's bad enough that this warns for

#define A 2
int f() { return A ^ 1; }

where as far as the users of A are concerned...

I see how this warning is arguably overzealous in the very special case of "raising" to the constant 1 (since nobody would ever write that by accident). However, if the user of A wants to indicate that they understand this is bitwise-xor, they can simply change the body of their f to return A ^ 0x1; — hex notation suppresses the warning. (Changing the definition of A as well is perhaps a good idea but not technically required.) IMO this is good enough and we should leave it. (What do you think, having seen the A ^ 0x1 workaround? Does that sufficiently address your needs?)

[...] I'm not seeing from the previous discussion whether this case was based on real world programmer errors or not

The description links to a couple of tweets showing examples from the wild:
https://twitter.com/jfbastien/status/1139298419988549632
https://twitter.com/mikemx7f/status/1139335901790625793

hvdijk added a comment.May 2 2021, 9:50 AM

It's bad enough that this warns for

#define A 2
int f() { return A ^ 1; }

where as far as the users of A are concerned...

I see how this warning is arguably overzealous in the very special case of "raising" to the constant 1 (since nobody would ever write that by accident). However, if the user of A wants to indicate that they understand this is bitwise-xor, they can simply change the body of their f to return A ^ 0x1; — hex notation suppresses the warning. (Changing the definition of A as well is perhaps a good idea but not technically required.) IMO this is good enough and we should leave it. (What do you think, having seen the A ^ 0x1 workaround? Does that sufficiently address your needs?)

I missed that hex notation for the RHS also suppresses the warning. Thanks, that's good to know. Combined with the fact that the case where LHS and RHS are both macros now no longer triggers the warning, that means one of LHS or RHS must be an integer literal, so there will always be a viable workaround once clang 13 is released. That may be good enough.

[...] I'm not seeing from the previous discussion whether this case was based on real world programmer errors or not

The description links to a couple of tweets showing examples from the wild:
https://twitter.com/jfbastien/status/1139298419988549632
https://twitter.com/mikemx7f/status/1139335901790625793

Those are different cases, they literally have 2 ^ in there. In the case I was asking about, the 2 comes from a macro expansion, but the ^ does not. That is much less likely to be a mistake, and one where I didn't see whether the warning for that was based on real-world concerns.

I remember that was interest to support macros too :) tbh I cant remember now such real world case where “macro ^ cst” was an issue but.. it was a long time ago ;)

If you are want, you can send patch to to avoid warning in this case, we can discuss it.

I remember that was interest to support macros too :) tbh I cant remember now such real world case where “macro ^ cst” was an issue but.. it was a long time ago ;)

If you are want, you can send patch to to avoid warning in this case, we can discuss it.

Will do; the change is trivial after D97445: it's just changing that to use || rather than &&. I'll do that and update the tests to match. I don't know whether that'll be what we want, but it may be easier to talk about once we have a good picture from the test case update what it affects.

No, we want to preserve warning for 2 ^ MACRO or 10 ^ MACRO.

hvdijk added a comment.EditedMay 2 2021, 10:32 AM

No, we want to preserve warning for 2 ^ MACRO or 10 ^ MACRO.

Perhaps that should warn even if the RHS is in hex form, or is an enumerator constant, or is not even constant at all.

libpng: #if-ed out pseudocode but just waiting for someone to copy and paste: for (i=11;i>=0;--i){ print i, " ", (1 - e(-(2^i)/65536*l(2))) * 2^(32-i), "\n"}.

https://github.com/mongodb/mongo/blob/master/src/third_party/IntelRDFPMathLib20U1/LIBRARY/float128/mphoc_macros.h:

#define MPHOC_MAX_CHAR  ((2 ^ (BITS_PER_CHAR - 1)) - 1)
#define MPHOC_MAX_SHORT ((2 ^ (BITS_PER_SHORT - 1)) - 1)
#define MPHOC_MAX_INT   ((2 ^ (BITS_PER_INT - 1)) - 1)
#define MPHOC_MAX_LONG  ((2 ^ (BITS_PER_LONG - 1)) - 1)
#define MPHOC_MAX_WORD  ((2 ^ (BITS_PER_WORD - 1)) - 1)

(and many more in there)

Perhaps that should warn even if the RHS is in hex form

It would be kinda strange, since in one clang release we ask users to silence warning with hex form and newer release would warn anyway. Not a fan of this decision.

, or is an enumerator constant, or

This looks like a good idea. +1.

is not even constant at all.

Depends, needs to be carefully evaluated (true positives vs false positives ratio).

Perhaps that should warn even if the RHS is in hex form

It would be kinda strange, since in one clang release we ask users to silence warning with hex form and newer release would warn anyway. Not a fan of this decision.

Fair point, but the clang suggestion is always to turn the LHS into hex form, never to turn the RHS in hex form, so if users followed clang's suggested silencing, it would continue to work. That's how I didn't even notice that I could change the RHS to hex form in the case I asked about.

And fully agreed that if we warn about any cases where we didn't warn before, we need to be very careful about the risk of false positives. When I create the new RFC patch, I'll try to get details on that and include it in the description.

When I create the new RFC patch, I'll try to get details on that and include it in the description.

Great! Thanks.