Page MenuHomePhabricator

[Sema] Suppress -Wformat diagnostics for bool types when printed using %hhd
ClosedPublic

Authored by erik.pilkington on Aug 27 2019, 6:18 PM.

Details

Summary

Instead, emit them under -Wformat-pedantic. This is particularly important for us because we have a lot of code that uses %hhd to print BOOL values, since on macOS BOOL is a typedef for signed char, and we emit fix-its recommending %hhd to print it. Now that this code is compiled on a target where BOOL is a typedef for _Bool, we're getting these -Wformat warnings. These aren't all that useful.

rdar://problem/54579473 Clang should not warn when using %hhd or %hd format specifiers on BOOL values

Diff Detail

Repository
rL LLVM

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptAug 27 2019, 6:18 PM

I'm wondering whether this should even warn pedantically. There are no format specifiers that apply directly to a _Bool datatype, so the user is left making a choice as to what specifier fits best. I think hhd and hhu are both equally reasonable types, as are the d and u groups directly -- I don't think we should warn on either of those specifiers with a _Bool argument. I could maybe see a case for pedantically diagnosing h, l, and ll prefixes with _Bool because that seems a bit type-confused to me. c as a specifier seems weird (given that false values will potentially terminate the string suddenly depending on terminal behavior, IIRC), but lc seems like type confusion again.

I'm wondering whether this should even warn pedantically. There are no format specifiers that apply directly to a _Bool datatype, so the user is left making a choice as to what specifier fits best. I think hhd and hhu are both equally reasonable types, as are the d and u groups directly -- I don't think we should warn on either of those specifiers with a _Bool argument. I could maybe see a case for pedantically diagnosing h, l, and ll prefixes with _Bool because that seems a bit type-confused to me. c as a specifier seems weird (given that false values will potentially terminate the string suddenly depending on terminal behavior, IIRC), but lc seems like type confusion again.

Hmm... on second though I think the l and ll prefixes are worth -Wformat proper, since that can result in printing a half-initialized integer when the argument ends up on the stack (we generate movl $0,(%rsp) for the first stack argument). I think %c is probably also worth a -Wformat, it seems like it would almost certainly be a mistake. I suppose we could pedantically warn on %hd, but I'm not sure. Who is the audience for -Wformat-pedantic? Users that expect it to diagnose any pedantically-UB printf calls will already be disappointed, since the C standard seems to be quite a lot more restrictive then us (7.21.6.9p9 says: "If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined." (I'm assuming "correct type" means "same type")). So I think we should just concern ourselves with calls that would lead to actual problems at runtime, or are likely programmer errors, and printf("%hd", (_Bool)0) doesn't seem to meet either of those bars.

I'm wondering whether this should even warn pedantically. There are no format specifiers that apply directly to a _Bool datatype, so the user is left making a choice as to what specifier fits best. I think hhd and hhu are both equally reasonable types, as are the d and u groups directly -- I don't think we should warn on either of those specifiers with a _Bool argument. I could maybe see a case for pedantically diagnosing h, l, and ll prefixes with _Bool because that seems a bit type-confused to me. c as a specifier seems weird (given that false values will potentially terminate the string suddenly depending on terminal behavior, IIRC), but lc seems like type confusion again.

Hmm... on second though I think the l and ll prefixes are worth -Wformat proper, since that can result in printing a half-initialized integer when the argument ends up on the stack (we generate movl $0,(%rsp) for the first stack argument).

Agreed.

I think %c is probably also worth a -Wformat, it seems like it would almost certainly be a mistake.

That's my feeling as well. At the very least, -Wformat-pedantic.

I suppose we could pedantically warn on %hd, but I'm not sure.

I think it makes sense to dos o.

Who is the audience for -Wformat-pedantic?

To me, the audience are people who want to catch semantically-benign-but-logically-inconsistent format strings. e.g., it's not wrong to pass 'c' to something formatted as %hd, but it smells fishy to do so.

Users that expect it to diagnose any pedantically-UB printf calls will already be disappointed, since the C standard seems to be quite a lot more restrictive then us (7.21.6.9p9 says: "If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined." (I'm assuming "correct type" means "same type")). So I think we should just concern ourselves with calls that would lead to actual problems at runtime, or are likely programmer errors, and printf("%hd", (_Bool)0) doesn't seem to meet either of those bars.

To me, actual problems at runtime belong in -Wformat and logical inconsistencies that don't cause runtime problems belong in -Wformat-pedantic. However, I think it's a defect that the C standard has no clearly UB-free way to print a _Bool value. I will bring this up on the reflectors to ask if we can clarify the standard here.

Address review comments.

To me, actual problems at runtime belong in -Wformat and logical inconsistencies that don't cause runtime problems belong in -Wformat-pedantic. However, I think it's a defect that the C standard has no clearly UB-free way to print a _Bool value. I will bring this up on the reflectors to ask if we can clarify the standard here.

The reflector discussion is still happening and there are issues with ambiguities that we are pretty sure we want to correct. I've got a paper out that touches on some of this: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2420.pdf

clang/include/clang/Basic/DiagnosticSemaKinds.td
8123 ↗(On Diff #219598)

How about: using '%0' format specifier, but argument has boolean value and then pass in the character specifier used?

clang/test/Sema/format-bool.c
26 ↗(On Diff #219598)

Just an FYI (not related to your patch): it seems that at least some people think this should be diagnosed as something other than by -Wformat-pedantic. Their thinking is that -Wformat-pedantic is for things that are required to have a diagnostic according to the standard but are not sufficiently interesting to warn about by default. This particular case is not required to be warned on by the standard, so it's not really a "pedantic" warning. It sounds like there may be interest in having -Wformat-pedantic for that understanding of pedantic, and introduce something like -Wformat-type-mismatch for these other cases where there is type confusion but not sufficiently dangerous to warrant warning by default?

37 ↗(On Diff #219598)

The diagnostic is both correct and incorrect. Maybe that's fine. However, == returns an int in C and int is reasonable to pass in to %c, so the diagnostic seems a bit strange when it claims the argument has a boolean value. Then again, the results really are boolean despite the type being an int.

I think I've convinced myself this is fine as-is. :-)

43 ↗(On Diff #219598)

Seems correct to me.

erik.pilkington marked 6 inline comments as done.

Update the diagnostic text.

The reflector discussion is still happening and there are issues with ambiguities that we are pretty sure we want to correct. I've got a paper out that touches on some of this: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2420.pdf

Nice, thanks for digging into this!

clang/include/clang/Basic/DiagnosticSemaKinds.td
8123 ↗(On Diff #219598)

Sure, copied that verbatim.

clang/test/Sema/format-bool.c
26 ↗(On Diff #219598)

That seems like a good idea to me, I agree that "pedantic" in the context of warnings means "technically incorrect according to the standard, but not really a big deal", which isn't really what -Wformat-pedantic is doing right now.

aaron.ballman accepted this revision.Sep 18 2019, 11:23 AM

LGTM, thank you for this!

This revision is now accepted and ready to land.Sep 18 2019, 11:23 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 18 2019, 12:07 PM