This is an archive of the discontinued LLVM Phabricator instance.

Add a cmake flag to turn `llvm_unreachable()` into builtin_trap() when assertions are disabled
ClosedPublic

Authored by mehdi_amini on Mar 15 2022, 3:46 PM.

Diff Detail

Event Timeline

mehdi_amini created this revision.Mar 15 2022, 3:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2022, 3:46 PM
mehdi_amini requested review of this revision.Mar 15 2022, 3:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2022, 3:46 PM

I like this idea.

Not sure if LLVM_UNREACHABLE_SHOULD_TRAP is quite right... would LLVM_UNREACHABLE_ALWAYS_TRAP be better?
(I'm honestly not sure. "should" seems a bit off to me, since it doesn't sound like a guarantee.)

Or: LLVM_UNREACHABLE_OPTIMIZE=OFF?
(Reasoning: it's disabling the optimization opportunity in no-asserts builds but leaving behind the debugging feature in asserts builds.)

llvm/include/llvm/Support/ErrorHandling.h
126–127

I think this part of the comment should be updated.

mehdi_amini marked an inline comment as done.

Address Duncan's comment: rename the flag to LLVM_UNREACHABLE_OPTIMIZE and update the comment

Or: LLVM_UNREACHABLE_OPTIMIZE=OFF?
(Reasoning: it's disabling the optimization opportunity in no-asserts builds but leaving behind the debugging feature in asserts builds.)

I like the naming! I had the same reasoning but didn't find a good name :)

llvm/include/llvm/Support/ErrorHandling.h
126–127

Done! PTAL

dexonsmith accepted this revision.Mar 16 2022, 8:03 PM

It feels to me like this is worth documenting in llvm/docs/CMake.rst, maybe next to LLVM_ABI_BREAKING_CHECKS since they both seem assertion related, so that it's more easily discovered.

Otherwise, LGTM, with more wording suggestion inline.

llvm/CMakeLists.txt
468

I wonder if the wording should be tied to the word "optimize". Also, note that trapping is a legit implementation of undefined behaviour; so another interpretation is that llvm_unreachable() is being treated as UB either way.

E.g.:

A "reached" llvm_unreachable() is undefined behaviour; ON optimizes the unreachable control flow (default), while OFF triggers a guaranteed trap

WDYT?
(I'm okay with landing the wording as you had it if you prefer it; can always update it later)

This revision is now accepted and ready to land.Mar 16 2022, 8:03 PM
dexonsmith added inline comments.Mar 16 2022, 8:04 PM
llvm/CMakeLists.txt
468

Or maybe this should be short and sweet:

Optimize control flow based on llvm_unreachable(); guaranteed trap when OFF

(and leave the longer explanation for llvm/docs/...)

mehdi_amini added inline comments.Mar 16 2022, 8:43 PM
llvm/CMakeLists.txt
468

Trapping is a legit implementation of UB, but the opposite isn't and this is the important bit I was trying to express.
I see what you mean with "optimize" but it may not be obvious for everyone that we're turning a trap into UB here.

mehdi_amini added inline comments.Mar 16 2022, 8:44 PM
llvm/CMakeLists.txt
468

What about: "Optimize llvm_unreachable() as undefined behavior (default), guaranteed trap when OFF"?

Update CMake.rst and the description of the option in CMake

Fix doc: s/STRING/BOOL/

dexonsmith added inline comments.Mar 17 2022, 12:56 PM
llvm/CMakeLists.txt
468

Sure, sounds fine to me!

llvm/include/llvm/Support/ErrorHandling.h
134–135

I'd prefer one of:

/// * When "OFF", a builtin_trap is emitted instead.
/// * When "OFF", a builtin_trap is emitted instead of enabling
//    optimizations or printing a reduced message.
/// * When "OFF", a builtin_trap is emitted instead of an
//    optimizer hint or printing a reduced message.

But your text here LGTM too; I don't think it needs to be perfect.

mehdi_amini marked an inline comment as done.Mar 17 2022, 3:20 PM
This revision was landed with ongoing or failed builds.Mar 17 2022, 3:21 PM
This revision was automatically updated to reflect the committed changes.
dyung added a subscriber: dyung.Mar 18 2022, 12:25 AM

This seems to be breaking our internal Release Windows build with errors of the sort:

7>C:\src\upstream\llvm_clean_git\llvm\lib\Support\Compression.cpp(104): error C4716: 'llvm::zlib::uncompress': must return a value
7>C:\src\upstream\llvm_clean_git\llvm\lib\Support\Compression.cpp(107): error C4716: 'llvm::zlib::crc32': must return a value
7>C:\src\upstream\llvm_clean_git\llvm\lib\Support\Compression.cpp(99): error C4716: 'llvm::zlib::uncompress': must return a value
...
5>C:\src\upstream\llvm_clean_git\clang\utils\TableGen\MveEmitter.cpp(480): error C4716: '`anonymous namespace'::Result::getIntegerValue': must return a value
...
3>C:\src\upstream\llvm_clean_git\llvm\utils\TableGen\GlobalISelEmitter.cpp(1175): error C4716: '`anonymous namespace'::PredicateMatcher::getValue': must return a value
3>C:\src\upstream\llvm_clean_git\llvm\utils\TableGen\GlobalISelEmitter.cpp(803): error C4716: '`anonymous namespace'::SwitchMatcher::getFirstCondition': must return a value
3>C:\src\upstream\llvm_clean_git\llvm\utils\TableGen\GlobalISelEmitter.cpp(800): error C4716: '`anonymous namespace'::SwitchMatcher::popFirstCondition': must return a value

The cmake command I ran for this build was:

cmake.exe -G "Visual Studio 16 2019" -DCLANG_ENABLE_ARCMT=OFF -DCMAKE_BUILD_TYPES=Release -DLLVM_BUILD_RUNTIME=OFF -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-ps4-scei -DLLVM_ENABLE_TIMESTAMPS=OFF -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_TOOL_LLD_BUILD=OFF -DLLVM_VERSION_SUFFIX= -DLLVM_LIT_ARGS="--verbose -j24" -Thost=x64 -DLLVM_ENABLE_PROJECTS=compiler-rt;clang;clang-tools-extra C:\src\upstream\llvm_clean_git\llvm

Can you take a look?

dexonsmith added inline comments.Mar 18 2022, 11:17 AM
llvm/include/llvm/Support/ErrorHandling.h
143

Based on:

7>C:\src\upstream\llvm_clean_git\llvm\lib\Support\Compression.cpp(104): error C4716: 'llvm::zlib::uncompress': must return a value

it looks like LLVM_BUILTIN_TRAP isn't guaranteed to be something the compiler understands as noreturn. Maybe this logic needs to fallback to llvm_unreachable_internal if LLVM_BUILTIN_TRAP is not a function.

mehdi_amini added inline comments.Mar 18 2022, 11:20 AM
llvm/include/llvm/Support/ErrorHandling.h
143

The bug is that when you suggested a renaming the macro I didn't inverse the condition: the previous name was something like SHOULD_TRAP but with the new name the logic is reversed! I need #elif !LLVM_UNREACHABLE_OPTIMIZE I think.

mehdi_amini reopened this revision.Mar 18 2022, 12:21 PM
This revision is now accepted and ready to land.Mar 18 2022, 12:21 PM

Fix the condition

llvm/include/llvm/Support/ErrorHandling.h
143

I fixed the condition but I'm not sure how to address the windows situation. I think I'd try to add an attribute or something to make it "noreturn" there as well, but rather leave it to someone who could actually test it.

This revision was landed with ongoing or failed builds.Mar 18 2022, 12:24 PM
This revision was automatically updated to reflect the committed changes.

By the way: it seems that there is no bot upstream building on Windows without assertions, I didn't get an email about the initial breakage!

dexonsmith added inline comments.Mar 18 2022, 2:32 PM
llvm/include/llvm/Support/ErrorHandling.h
143

Hah, that's what I thought at first, and then I convinced myself that it wasn't wrong and jumped to the other conclusion :).

For the windows thing, maybe don't rely on LLVM_BUILTIN_TRAP for now, if it's not guaranteed to be noreturn. Could change the builtin trap logic to export another variable:

#if __has_builtin(__builtin_trap) || defined(__GNUC__)
# define LLVM_BUILTIN_TRAP __builtin_trap()
# define LLVM_BUILTIN_TRAP_NO_RETURN 1
#elif defined(_MSC_VER)
// The __debugbreak intrinsic is supported by MSVC, does not require forward
// declarations involving platform-specific typedefs (unlike RaiseException),
// results in a call to vectored exception handlers, and encodes to a short
// instruction that still causes the trapping behavior we want.
# define LLVM_BUILTIN_TRAP __debugbreak()
# define LLVM_BUILTIN_TRAP_NO_RETURN 0
#else
# define LLVM_BUILTIN_TRAP *(volatile int*)0x11 = 0
# define LLVM_BUILTIN_TRAP_NO_RETURN 0
#endif

Assuming something similar creates a variable LLVM_BUILTIN_UNREACHABLE_NON_EMPTY (to fix the bug mentioned in the other inline comment):

#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif LLVM_UNREACHABLE_OPTIMIZE && LLVM_BUILTIN_UNREACHABLE_NON_EMPTY
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#elif LLVM_BUILTIN_TRAP_NO_RETURN
#define llvm_unreachable(msg) LLVM_BUILTIN_TRAP
#else
#define llvm_unreachable(msg) ::llvm::llvm_unreachable_internal()
#endif
144

Just noticed that this pre-existing logic was wrong.

LLVM_BUILTIN_UNREACHABLE is defined whenever llvm/Support/Compiler.h is included:

#if __has_builtin(__builtin_unreachable) || defined(__GNUC__)
# define LLVM_BUILTIN_UNREACHABLE __builtin_unreachable()
#elif defined(_MSC_VER)
# define LLVM_BUILTIN_UNREACHABLE __assume(false)
#else
# define LLVM_BUILTIN_UNREACHABLE
#endif

That probably didn't used to be true.

I think the intent of this code is to check whether LLVM_BUILTIN_UNREACHABLE is empty. If the Compiler.h logic exports another variable that can be used for detecting that, this could become:

#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif LLVM_BUILTIN_UNREACHABLE_NON_EMPTY
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#else
#define llvm_unreachable(msg) ::llvm::llvm_unreachable_internal()
#endif

(before your patch)

mehdi_amini added inline comments.Mar 19 2022, 3:37 PM
llvm/include/llvm/Support/ErrorHandling.h
143

Right, that would be a valid path. But what about another path of making LLVM_BUILTIN_TRAP correctly "noreturn" on Windows as well? I'm not a Windows user but someone more versed into Windows may be able to help on this?

144

I wonder if just:

#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif!LLVM_UNREACHABLE_OPTIMIZE
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#else
#define llvm_unreachable(msg) LLVM_BUILTIN_TRAP
#endif

Wouldn't be enough? Yes it would accounts to "nothing" when LLVM_BUILTIN_UNREACHABLE is empty, but that seems just correct and maybe even desirable: when LLVM_UNREACHABLE_OPTIMIZE is enabled "do nothing"
seems closer to the behavior we're looking for rather than calling llvm_unreachable_internal() right?

dexonsmith added inline comments.
llvm/include/llvm/Support/ErrorHandling.h
143

Maybe the windows case could be fixed with another intrinsic, other than __debugbreak. Note that __debugbreak() can return. From https://docs.microsoft.com/en-us/cpp/intrinsics/debugbreak?view=msvc-170 :

Causes a breakpoint in your code, where the user will be prompted to run the debugger.

Maybe @rnk has an idea for something else.

Even with that fixed, is there a way to mark the fall-through case as "no-return"?

# define LLVM_BUILTIN_TRAP *(volatile int*)0x11 = 0

That said, I'm not sure if making builtin_trap() no-return needs to be a blocker for this patch.

144

Did you mean:

#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif LLVM_UNREACHABLE_OPTIMIZE
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#else
#define llvm_unreachable(msg) LLVM_BUILTIN_TRAP
#endif

? (Or maybe I'm reading it backwards)

Regardless, one of the important pieces of llvm_unreachable() is that it suppresses diagnostics about the control path it's on. If it expands to "empty"/nothing then it won't suppress diagnostics.

I think the behaviour here changed accidentally. @aeubanks updated LLVM_BUILTIN_UNREACHABLE as a drive-by in 26827337dff26ba3450721f880d4c6caaf2a8219 / https://reviews.llvm.org/D111581:

-#if __has_builtin(__builtin_unreachable) || LLVM_GNUC_PREREQ(4, 5, 0)
+#if __has_builtin(__builtin_unreachable) || defined(__GNUC__)
 # define LLVM_BUILTIN_UNREACHABLE __builtin_unreachable()
 #elif defined(_MSC_VER)
 # define LLVM_BUILTIN_UNREACHABLE __assume(false)
+#else
+# define LLVM_BUILTIN_UNREACHABLE
 #endif
Add a missing #endif for LLVM_ATTRIBUTE_UNREACHABLE.

Before that, the fallback behaviour was to hit the #else case here so that llvm_unreachable() always expanded to something that was no-return.

Maybe we should revert that part of 26827337dff26ba3450721f880d4c6caaf2a8219 patch before landing this.

dexonsmith added inline comments.Mar 21 2022, 11:59 AM
llvm/include/llvm/Support/ErrorHandling.h
144

I suggest:

#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif !defined(LLVM_BUILTIN_UNREACHABLE)
#define llvm_unreachable(msg) ::llvm::llvm_unreachable_internal()
#elif LLVM_UNREACHABLE_OPTIMIZE
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#else
#define llvm_unreachable(msg) LLVM_BUILTIN_TRAP, LLVM_BUILTIN_UNREACHABLE
#endif

I *think* this will have all the intended effects:

  • If asserts are off, use the function (like a fancy assertion).
  • Else, if there is no LLVM_BUILTIN_UNREACHABLE, call the function without args (works again as of https://reviews.llvm.org/D122167).
  • Else, always call LLVM_BUILTIN_UNREACHABLE to suppress warnings about this control flow path...
    • ... but if !LLVM_UNREACHABLE_OPTIMIZE, first call LLVM_BUILTIN_TRAP to block optimizations.
    • ... and if LLVM_BUILTIN_TRAP happens to be marked no-return, the extra LLVM_BUILTIN_UNREACHABLE should get cleaned up by the optimizer.

I suggest:

#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif !defined(LLVM_BUILTIN_UNREACHABLE)
#define llvm_unreachable(msg) ::llvm::llvm_unreachable_internal()
#elif LLVM_UNREACHABLE_OPTIMIZE
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#else
#define llvm_unreachable(msg) LLVM_BUILTIN_TRAP, LLVM_BUILTIN_UNREACHABLE
#endif

I *think* this will have all the intended effects:

  • If asserts are off, use the function (like a fancy assertion).
  • Else, if there is no LLVM_BUILTIN_UNREACHABLE, call the function without args (works again as of https://reviews.llvm.org/D122167).
  • Else, always call LLVM_BUILTIN_UNREACHABLE to suppress warnings about this control flow path...
    • ... but if !LLVM_UNREACHABLE_OPTIMIZE, first call LLVM_BUILTIN_TRAP to block optimizations.
    • ... and if LLVM_BUILTIN_TRAP happens to be marked no-return, the extra LLVM_BUILTIN_UNREACHABLE should get cleaned up by the optimizer.

LG: https://reviews.llvm.org/D122170

llvm/include/llvm/Support/ErrorHandling.h
144

Regardless, one of the important pieces of llvm_unreachable() is that it suppresses diagnostics about the control path it's on. If it expands to "empty"/nothing then it won't suppress diagnostics.

Ah I forgot about this. That said: when you encounter a toolchain that does not support a proper "unreachable", that would just mean they don't have a way to suppress the diagnostics and have to live with warnings. For the *(volatile int*)0x11 = 0, you can wrap it in a macro with a noreturn attribute: []() __attribute__((__noreturn__)) { *(volatile int*)0x11 = 0; }(); (the same may apply to Windows debugbreak() builtin by the way).

Anyway I think your solution will work just fine!