This is an archive of the discontinued LLVM Phabricator instance.

[NFC] Make format() more amenable to format attributes
ClosedPublic

Authored by fcloutier on Oct 26 2022, 5:35 PM.

Details

Reviewers
ahatanak
Summary

Second take on https://reviews.llvm.org/rGfb1e90ef07fec0d64a05c0b6d41117a5ea3e8344, which was pulled out because it broke GCC-based builds. The offending code was:

constexpr uint64_t specifierBit(char C) { return 1 << (C - 0x40); }

The issue is that the shift expression has type int, but the shift value has range 0..<0x40. Shifting (int)1 by more than 30 is undefined behavior.

With GCC, this manifested with test failures, and when sanitizers were enabled, with sanitizer traps. However, because the function is constexpr, Clang chose to evaluate it at compile-time even when when its execution was undefined. (Both Clang and GCC reject calls to specifierBit and its specifierMask caller when using constinit.) Unfortunately, somehow, the result was good enough to pass all tests when built with clang; and since the undefined behavior was evaluated at build time, UBSan wouldn't catch it.

I have reproduced the problem with a GCC build of LLVM and verified that this change fixes it:

constexpr uint64_t specifierBit(char C) { return (uint64_t)1 << (C - 0x40); }

Diff Detail

Event Timeline

fcloutier created this revision.Oct 26 2022, 5:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2022, 5:35 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
fcloutier requested review of this revision.Oct 26 2022, 5:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2022, 5:35 PM
ahatanak accepted this revision.Oct 26 2022, 6:03 PM

Probably not NFC according to https://llvm.org/docs/Lexicon.html, but LGTM.

llvm/lib/Support/Format.cpp
60

Can you add an explanation for what the number 0x40 means or why it was chosen?

This revision is now accepted and ready to land.Oct 26 2022, 6:03 PM
fcloutier updated this revision to Diff 470987.Oct 26 2022, 6:51 PM

Add comment explaining what specifierBit does.

since the undefined behavior was evaluated at build time, UBSan wouldn't catch it.

Can you file a bug for this? Opportunistic constant evaluation like that shouldn't bypass ubsan.

fcloutier updated this revision to Diff 472168.Oct 31 2022, 3:55 PM

Fix failing test when parsing "%" as a format string.

since the undefined behavior was evaluated at build time, UBSan wouldn't catch it.

Can you file a bug for this? Opportunistic constant evaluation like that shouldn't bypass ubsan.

I'm not able to reproduce the problem and my setup was janky, so I must have jumped to conclusions too fast after a tangential Godbolt test shows that Clang will evaluate some UB at compile-time (with sanitizers disabled) where GCC won't. This example is probably happening in the back-end instead of the front-end. Sorry for the bad signal.

fcloutier updated this revision to Diff 472353.Nov 1 2022, 10:50 AM

Fix whitespace difference that clang-format didn't like.

since the undefined behavior was evaluated at build time, UBSan wouldn't catch it.

Can you file a bug for this? Opportunistic constant evaluation like that shouldn't bypass ubsan.

I'm not able to reproduce the problem and my setup was janky, so I must have jumped to conclusions too fast after a tangential Godbolt test shows that Clang will evaluate some UB at compile-time (with sanitizers disabled) where GCC won't. This example is probably happening in the back-end instead of the front-end. Sorry for the bad signal.

I managed to come up with a case similar to what you described: https://github.com/llvm/llvm-project/issues/58735 . I think whether the issue triggers depends on how the result of the call is used.

fcloutier closed this revision.Nov 2 2022, 2:00 PM

Merged to mainline; sadly forgot to update the commit message to include the differential ID, so manually closing. I'll see if I can find existing git hooks that check for that in the future.

https://reviews.llvm.org/rGcf239c2f1777