This is an archive of the discontinued LLVM Phabricator instance.

Suppress -Wchar-subscripts if the index is a literal char
ClosedPublic

Authored by edward-jones on Mar 4 2019, 2:35 AM.

Details

Summary

Assume that the user knows what they're doing if they provide a char literal as an array index. This more closely matches the behavior of GCC.

Diff Detail

Event Timeline

edward-jones created this revision.Mar 4 2019, 2:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2019, 2:35 AM

Do you have some evidence that the current behavior is causing a lot of false positives in the wild? For ASCII character literals, I can sort of guess at why people might want to do this, but for things like wide character literals, or character literals relying on the current code page, etc, I'm less convinced.

Do you have some evidence that the current behavior is causing a lot of false positives in the wild? For ASCII character literals, I can sort of guess at why people might want to do this, but for things like wide character literals, or character literals relying on the current code page, etc, I'm less convinced.

I don't know about the false positive rate, just the one report on twitter which triggered me to submit this change. As for wide character literals I was under the impression that they would be promoted to integers and wouldn't have triggered the -Wchar-subscript anyway?

aaron.ballman accepted this revision.Mar 14 2019, 11:44 AM

Do you have some evidence that the current behavior is causing a lot of false positives in the wild? For ASCII character literals, I can sort of guess at why people might want to do this, but for things like wide character literals, or character literals relying on the current code page, etc, I'm less convinced.

I don't know about the false positive rate, just the one report on twitter which triggered me to submit this change. As for wide character literals I was under the impression that they would be promoted to integers and wouldn't have triggered the -Wchar-subscript anyway?

Ordinary character literals are promoted as well, aren't they? 'a' has type char, and that should promote up to int. My point was that you are silencing the warning on any character literal, not just ordinary character literals. However, I see now that -Wchar-subscript is very serious about char, rather than any character type, so I guess you don't need to distinguish between character literal kinds because the type system already deals with that for you.

LGTM!

This revision is now accepted and ready to land.Mar 14 2019, 11:44 AM
This revision was automatically updated to reflect the committed changes.
sberg added a subscriber: sberg.Nov 7 2019, 7:59 AM

But how about literals like '\x80' where the promoted value depends on whether plain char is signed or unsigned?

xbolva00 added a subscriber: xbolva00.EditedNov 7 2019, 9:10 AM

Well, i am not sure if one twitter report is good motivation to criple warning.

But how about literals like '\x80' where the promoted value depends on whether plain char is signed or unsigned?

If 'char' is signed and index into an array then this will typically trigger an -Warray-bounds warning because it references before the start of the array.

Well, i am not sure if one twitter report is good motivation to criple warning.

The motivation for suppressing the warning was that it is not uncommon to use a character literal in lookup tables. In addition in brings the warning inline with the behaviour of -Wchar-subscripts in GCC.

It was said Clang does not have to match gcc 1:1.

Just because someone uses weird patterns and instead of using pragma push/pop diagnostic in his/her code, should we really change it in Clang?

You should provide extra flag which controls this specific behaviour (and on by default) so user can use it to disable only this specific case.

Okay I've reverted this in rG90ecfa2f5f7f . I'll make improvements and resubmit this for review.

Well, i am not sure if one twitter report is good motivation to criple warning.

The motivation for suppressing the warning was that it is not uncommon to use a character literal in lookup tables. In addition in brings the warning inline with the behaviour of -Wchar-subscripts in GCC.

Peanut gallery says: If it's "not uncommon," then {when,if} you resubmit this for review, it should be easy and therefore mandatory to provide at least one example of someone actually using a character literal as a lookup table index. I've never seen such a construct in my life, so real-world examples (e.g. GitHub links) would be useful.

Especially useful would be links to GitHub examples of 'i'[arr] or arr['i'] where the most appropriate fix is legitimately "upgrade your Clang thus suppressing the diagnostic" as opposed to "rewrite the expression as arr[int('i')] or arr[105] depending on what you actually mean it to do."

int('i') is nice fixit we could emit in this case! Thanks for this idea!

sberg added a comment.Nov 7 2019, 11:59 PM

But how about literals like '\x80' where the promoted value depends on whether plain char is signed or unsigned?

If 'char' is signed and index into an array then this will typically trigger an -Warray-bounds warning because it references before the start of the array.

My thought was more that it might be useful as a kind of portability warning.

But how about literals like '\x80' where the promoted value depends on whether plain char is signed or unsigned?

If 'char' is signed and index into an array then this will typically trigger an -Warray-bounds warning because it references before the start of the array.

My thought was more that it might be useful as a kind of portability warning.

I'm not opposed to the warning per-se, but do you have evidence that the situation occurs in real-world code?

sberg added a comment.Nov 8 2019, 12:49 AM

But how about literals like '\x80' where the promoted value depends on whether plain char is signed or unsigned?

If 'char' is signed and index into an array then this will typically trigger an -Warray-bounds warning because it references before the start of the array.

My thought was more that it might be useful as a kind of portability warning.

I'm not opposed to the warning per-se, but do you have evidence that the situation occurs in real-world code?

No. (My original comment was driven by my, potentially false, assumption that this warning was originally, at least in part, meant to flag portability issues---along the lines of: why else would the warning trigger at all when char is unsigned.)