This is an archive of the discontinued LLVM Phabricator instance.

[Sema] -Wformat: support C23 format specifier %b %B
ClosedPublic

Authored by MaskRay on Aug 3 2022, 1:17 AM.

Details

Summary

Close #56885: WG14 N2630 added %b to fprintf/fscanf and recommended %B for
fprintf. This patch teaches -Wformat %b for the printf/scanf family of functions
and %B for the printf family of functions.

glibc 2.35 and latest Android bionic added %b/%B printf support. From
https://www.openwall.com/lists/libc-coord/2022/07/ no scanf support is available
yet.

Like GCC, we don't test library support.

GCC 12 -Wformat -pedantic emits a warning:

warning: ISO C17 does not support the ‘%b’ gnu_printf format [-Wformat=]

The behavior is not ported.

Note: freebsd_kernel_printf uses %b differently.

Diff Detail

Event Timeline

MaskRay created this revision.Aug 3 2022, 1:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2022, 1:17 AM
Herald added a subscriber: StephenFan. · View Herald Transcript
MaskRay requested review of this revision.Aug 3 2022, 1:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2022, 1:17 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
dim accepted this revision.Aug 3 2022, 10:58 AM
dim added a subscriber: emaste.

LGTM.

This revision is now accepted and ready to land.Aug 3 2022, 10:58 AM
dim added a comment.Aug 3 2022, 11:04 AM

GCC 12 -Wformat -pedantic emits a warning:

warning: ISO C17 does not support the ‘%b’ gnu_printf format [-Wformat=]

The behavior is not ported (and it's unclear to me how to implement it).

If you really want this, I think it can be implemented by looking at LangOpts::LangStd.

enh accepted this revision.Aug 3 2022, 11:06 AM

thanks!

Thanks for working on this! Please also add a release note for it.

I think there are changes missing for ScanfFormatString.cpp to handle bArg and BArg that should be handled at the same time. WDYT?

Thanks for working on this! Please also add a release note for it.

Will add:)

I think there are changes missing for ScanfFormatString.cpp to handle bArg and BArg that should be handled at the same time. WDYT?

I noticed this: https://github.com/llvm/llvm-project/issues/56885#issuecomment-1203636181

I think making scanf in the same patch makes sense. Let me check existing tests...

enh added a comment.Aug 3 2022, 11:43 AM

I think making scanf in the same patch makes sense. Let me check existing tests...

fwiw, most of the libcs seem to have been doing these separately because scanf is harder. i hope to have bionic's scanf done this week though.

(note in case you haven't already that there is no %B for scanf, just %b...)

I think making scanf in the same patch makes sense. Let me check existing tests...

fwiw, most of the libcs seem to have been doing these separately because scanf is harder. i hope to have bionic's scanf done this week though.

(note in case you haven't already that there is no %B for scanf, just %b...)

Whoops, you're right, that recommended practice I saw nearby was for fwprintf and not fscanf.

MaskRay updated this revision to Diff 449740.Aug 3 2022, 12:31 PM
MaskRay retitled this revision from [Sema] -Wformat: support C23 printf %b %B to [Sema] -Wformat: support C23 format specifier %b %B.
MaskRay edited the summary of this revision. (Show Details)

add scanf support

I think making scanf in the same patch makes sense. Let me check existing tests...

fwiw, most of the libcs seem to have been doing these separately because scanf is harder. i hope to have bionic's scanf done this week though.

(note in case you haven't already that there is no %B for scanf, just %b...)

Thanks for the reminder:) I subscribed to https://www.openwall.com/lists/libc-coord/2022/08/ but working on this issue led me to read the discussion:)

scanf("%B", i); // expected-warning{{invalid conversion specifier 'B'}} tests this.

GCC 12 -Wformat -pedantic emits a warning:

warning: ISO C17 does not support the ‘%b’ gnu_printf format [-Wformat=]

The behavior is not ported (and it's unclear to me how to implement it).

If you really want this, I think it can be implemented by looking at LangOpts::LangStd.

Something like !getLangOpts().C2X I suppose, but I do not find how to check both -Wformat and -Wpedantic.
Also, for the nature of the diagnostic, I think something in TableGen like def ext_... InGroup<C2x> would make sense but using a C23/C2x related -W option would deviate from the GCC behavior.
ISTM adding the diagnostic (even if we do) is not so necessary in this patch.

aaron.ballman accepted this revision.Aug 4 2022, 5:09 AM

ISTM adding the diagnostic (even if we do) is not so necessary in this patch.

I tend to agree. I think we'll want such a diagnostic at some point, and I think it's fine to deviate from the warning group GCC uses. We already have diagnostic grouping patterns for this sort of thing and we should use those consistently.

LGTM with some minor corrections to the release note. Thanks for this!

clang/docs/ReleaseNotes.rst
70–71
MaskRay updated this revision to Diff 450046.Aug 4 2022, 10:18 AM
MaskRay marked an inline comment as done.

Thanks for quick reviews!
Updated ReleaseNotes.rst

MaskRay edited the summary of this revision. (Show Details)Aug 4 2022, 10:19 AM
MaskRay edited the summary of this revision. (Show Details)Aug 4 2022, 10:26 AM
This revision was landed with ongoing or failed builds.Aug 4 2022, 10:26 AM
This revision was automatically updated to reflect the committed changes.
enh added a comment.Aug 4 2022, 10:29 AM

GCC 12 -Wformat -pedantic emits a warning:

warning: ISO C17 does not support the ‘%b’ gnu_printf format [-Wformat=]

The behavior is not ported (and it's unclear to me how to implement it).

If you really want this, I think it can be implemented by looking at LangOpts::LangStd.

Something like !getLangOpts().C2X I suppose, but I do not find how to check both -Wformat and -Wpedantic.
Also, for the nature of the diagnostic, I think something in TableGen like def ext_... InGroup<C2x> would make sense but using a C23/C2x related -W option would deviate from the GCC behavior.
ISTM adding the diagnostic (even if we do) is not so necessary in this patch.

note that that's not even what we'd want for Android anyway ... availability and behavior on Android is always[1] predicated on the API level, not the C standard. app developers can't avoid dealing with the API level, so we've avoided having an orthogonal versioning concern for them. (this is true in the headers too, not just behavior: on Android you always get the API-level-appropriate _library_ no matter what version of the language you select.)

and, like i said elsewhere, no-one's ever noticed that clang just assumes %m is always available. even though _i've_ wanted %b since i was a kid[2], i'm not actually expecting %b will be heavily used in practice. (tbh, i doubt %b will ever be used in scanf() outside of tests. as far as i'm concerned only the printf() side is actually useful...)


  1. the removal of gets() is the only exception that springs to mind.
  2. even though i know that for weird-ass machines like the PDP series, octal was more useful than it is to the rest of us, i still can't believe %b wasn't in B, let alone C!