This is an archive of the discontinued LLVM Phabricator instance.

[clang][Sema] Fix format size estimator's handling of %o, %x, %X with alternative form
ClosedPublic

Authored by hazohelet on Aug 29 2023, 2:08 PM.

Details

Summary

The wrong handling of %x specifier with alternative form causes a false positive in linux kernel (https://github.com/ClangBuiltLinux/linux/issues/1923#issuecomment-1696075886)

The kernel code: https://github.com/torvalds/linux/blob/651a00bc56403161351090a9d7ddbd7095975324/drivers/media/pci/cx18/cx18-mailbox.c#L99

This patch fixes this handling, and also adds some standard wordings as comments to clarify the reason.

Diff Detail

Event Timeline

hazohelet created this revision.Aug 29 2023, 2:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 29 2023, 2:08 PM
hazohelet requested review of this revision.Aug 29 2023, 2:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 29 2023, 2:08 PM
clang/lib/Sema/SemaChecking.cpp
966

Then couldn't we increase the Size if the FieldWidth is unspecified?

hazohelet marked an inline comment as done.Aug 30 2023, 2:31 AM
hazohelet added inline comments.
clang/lib/Sema/SemaChecking.cpp
966

We cannot, unless the formatted integer is known to be nonzero.
("%#x", 0) is "0" and not "0x0"

Mind adding tests for %#X and %#o as well, since you're changing the behavior of those, too?

I think it would also be good to have explicit tests for %o and %b where the value being printed was and was not 0. IIUC, we cannot diagnose those (unless a literal value is being printed)?

(Thanks for finding that one of the kernel cases was a false positive!)

hazohelet updated this revision to Diff 555043.Aug 31 2023, 7:38 AM
hazohelet marked an inline comment as done.

Added %X and %o tests, which are also affected

Mind adding tests for %#X and %#o as well, since you're changing the behavior of those, too?

I think it would also be good to have explicit tests for %o and %b where the value being printed was and was not 0. IIUC, we cannot diagnose those (unless a literal value is being printed)?

(Thanks for finding that one of the kernel cases was a false positive!)

The current implementation only sees the format string and calculates the minimum possible length without taking into accout the actual value passed to it. So currently every case assumes that the value can be 0, even when a literal is passed.
Even if the printed expression is not literal, we can use something like Expr::EvaluateAsRValue to learn about the printed value.

As for the %b, it's C23 feature and Wfortify-source doesn't seem to support it yet (%b is considered a valid specifier, but does not increase Size here). I think %b tests should be added when we support it.

clang/test/Sema/warn-fortify-source.c
100–102

Note that GCC -Wformat-truncation can warn on some of these.

https://godbolt.org/z/jE3axWe1W

Looks like the diagnostic keeps an up and lower bounds on the estimated format string expansion.

Trunk for Clang also warns for these, so is this change a regression? Or are both GCC and Clang (trunk) incorrect?

121–123

Clang @ trunk and GCC both warn for these 3. Regression?
https://godbolt.org/z/j551hTE5E

hazohelet added inline comments.Aug 31 2023, 6:57 PM
clang/test/Sema/warn-fortify-source.c
100–102

Clang trunk is saying something factually incorrect because it says the output will always be truncated, when in fact __builtin_snprintf(buf, 2, "%#x", n); doesn't trigger truncation if n is zero.

GCC is correct but is more conservative than clang's ALWAYS be truncated diagnostics.
GCC's warning message is ... directive output MAY BE truncated .
GCC doesn't warn on it when n is known to be zero. (https://godbolt.org/z/E51a3Pfhr)

GCC's behavior makes sense here because the truncation does happen whenever n is not zero. If the user knows n is zero then they have no reason to use %#x specifier.
So, I think it makes sense to assume n is not zero and emit diagnostics, but it definitely needs diagnostics rewording like is likely to be truncated.

clang/docs/ReleaseNotes.rst
125

??

139

looks like a leak from another patch ^^!

nickdesaulniers accepted this revision.Sep 8 2023, 11:44 AM

Thanks for the patch!

clang/test/Sema/warn-fortify-source.c
100–102

I see; thanks for the explanation. Sorry for the code review delay; I missed this comment in my inbox.

This revision is now accepted and ready to land.Sep 8 2023, 11:44 AM
hazohelet marked 4 inline comments as done.Sep 9 2023, 1:24 AM

Thanks for the review!

clang/docs/ReleaseNotes.rst
125

I'm not seeing any diff around here, so perhaps you were seeing some Phabricator bug?

clang/test/Sema/warn-fortify-source.c
100–102

As a future plan, I think it's generally good to follow GCC's behavior regarding this format warning considering GCC's high smartness on it.
I want to emit some warning here like GCC does because users are highly likely to expect non-zero value to be printed, but I think it's reasonable to defer it for now because it's not ideal to add a new warning flag when we are considering changing this warning's flag from Wfortify-source to Wformat-truncation

This revision was landed with ongoing or failed builds.Sep 11 2023, 8:23 PM
This revision was automatically updated to reflect the committed changes.
hazohelet marked an inline comment as done.