This is an archive of the discontinued LLVM Phabricator instance.

[clang] diagnose function fallthrough
Needs Review · Public

Authored by nickdesaulniers on Mar 20 2023, 4:01 PM.

Details

Summary

Debugging functions that have fallen through to the next function is not
fun. So much so that the Linux kernel has a host utility called objtool
that can report such cases of fallthrough (for architectures objtool
supports). Example:

drivers/media/i2c/m5mols/m5mols.o: warning: objtool:
m5mols_set_fmt() falls through to next function __cfi_m5mols_open()

With this new diagnostic, -Wfunction-fallthrough, we can produce a
similar diagnostic:

drivers/media/i2c/m5mols/m5mols_core.c:573:12: warning: function
'm5mols_set_fmt' falls through to next function
[-Wfunction-fallthrough]
static int m5mols_set_fmt(struct v4l2_subdev *sd,
           ^

This can help diagnose the improper usage of __builtin_unreachable(). It
can also help us spot potential codegen bugs in LLVM itself.
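
A minimal sketch (not from the patch) of the kind of code the diagnostic targets: if the promise encoded by __builtin_unreachable() is ever violated, the generated code can run straight off the end of the function into whatever the linker placed next.

/* Assumes f() is only ever called with 0 or 1; if that assumption is
 * wrong, control reaches __builtin_unreachable() and simply falls off
 * the end of f() at runtime. */
int f(int x) {
    switch (x) {
    case 0: return 10;
    case 1: return 20;
    }
    __builtin_unreachable(); /* "this cannot happen" */
}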

Example issues faced in the past:
https://github.com/ClangBuiltLinux/linux/issues?q=is%3Aissue+falls+label%3A%22%5BTOOL%5D+objtool%22+
Feature request:
https://lore.kernel.org/llvm/CAHk-=wgTSdKYbmB1JYM5vmHMcD9J9UZr0mn7BOYM_LudrP+Xvw@mail.gmail.com/

Event Timeline

Herald added a project: Restricted Project. · Mar 20 2023, 4:01 PM
Herald added a subscriber: hiraditya.
nickdesaulniers requested review of this revision. · Mar 20 2023, 4:01 PM
Herald added projects: Restricted Project, Restricted Project. · Mar 20 2023, 4:01 PM
nickdesaulniers added inline comments.
llvm/test/CodeGen/AArch64/function-fallthrough.ll:7

TODO: add test that the final MBB is a call to a noreturn function followed by unreachable

I'm concerned about the potential for false positives:

  • If a user doesn't mark a function noreturn, but it doesn't actually ever return.
  • A function is conditionally broken, but the condition is never actually true (for example, jump threading creates a dead codepath).
  • A function might actually be unconditionally broken, but it doesn't matter because it never actually gets called. This can easily happen with C++ templates, or an "outlining" optimization on LLVM IR.

For some of these situations, there are somewhat straightforward workarounds we can suggest for users... but in some cases, there aren't. I really don't want to implement warnings where the best suggestion we can make to fix the warning is "Try refactoring your code; it might go away".

Yeah, we already have the return type warning and UBSan for the dynamic version of this check - seems OK? That a function lowers to unreachable isn't a bug - there's a ton of them in LLVM, quite intentionally as unimplemented functions that should never be called.

If the issue is with the actual fallthrough - that's been discussed elsewhere recently, and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

If the issue is with the actual fallthrough - that's been discussed elsewhere recently, and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

As I understand things, folks are confused about the runtime behavior of falling through and magically calling some other function. If we can warn about that situation from the FE without false positives, then huttah, I think that's sufficient. But I don't think we know of a way to do that from the FE, so I think ensuring a runtime trap is more helpful to folks than the status quo. (Also, in general, I think we want to avoid emitting diagnostics from the backend -- that has usability issues where clean -fsyntax-only builds suddenly get warnings when doing full codegen, or in circumstances where the optimization level changes the diagnostic output, etc.)

I'm concerned about the potential for false positives:

  • If a user doesn't mark a function noreturn, but it doesn't actually ever return.

Ok, but a user can fix that in their code. Also, I think we could possibly help by adding diagnostics for this. Unconditionally calling a noreturn function or having an obviously infinite loop should trigger such a diagnostic, with a note encouraging the user to mark their function noreturn as well?
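
For instance, a hypothetical sketch of candidates for such a note (the code is well-formed C; the suggested diagnostic is the speculative part):

#include <stdlib.h>

/* Never returns, but isn't marked noreturn; a companion diagnostic
 * could suggest adding _Noreturn / __attribute__((noreturn)). */
void spin(void) {
    for (;;)
        ;
}

/* Unconditionally calls a noreturn function (abort), so it can never
 * return either; the same suggested note applies. */
void fail(void) {
    abort();
}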

  • A function is conditionally broken, but the condition is never actually true (for example, jump threading creates a dead codepath).
  • A function might actually be unconditionally broken, but it doesn't matter because it never actually gets called. This can easily happen with C++ templates, or an "outlining" optimization on LLVM IR.

These sound like bugs in LLVM. Even if we don't expose function fallthrough to users as a diagnostic warning (via -Wfunction-fallthrough), perhaps a -mllvm flag is still handy for llvm developers to spot places where we could perhaps improve codegen.

I don't think we should ever emit a branch to an unreachable block; perhaps we need to be more aggressive about cleaning those up? Otherwise it seems like a waste of instructions, which pollute I$.

For some of these situations, there are somewhat straightforward workarounds we can suggest for users... but in some cases, there aren't. I really don't want to implement warnings where the best suggestion we can make to fix the warning is "Try refactoring your code; it might go away".

Contrast that with the intent of the feature request: diagnosing when a function may fall through to another. That is quite painful to debug when it occurs at runtime. How do we reconcile?

Yeah, we already have the return type warning

Which warning?

and UBSan for the dynamic version of this check - seems OK?

Which ubsan check?

That a function lowers to unreachable isn't a bug - there's a ton of them in LLVM, quite intentionally as unimplemented functions that should never be called.

IDK about an entire function lowering to unreachable; it's more that there exists a branch to an unreachable basic block (or simply an unreachable inst) that results in one function falling through to another.

If the issue is with the actual fallthrough - that's been discussed elsewhere recently,

Link?

and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

Right, at this point, I'm probably just going to turn on -mllvm -trap-unreachable in the Linux kernel. I might create a -f flag for it, since I do my best to avoid using -mllvm in the kernel's build system. I think this is unfortunate because I suspect it will increase the kernel image size and hide codegen bugs in LLVM, particularly when sanitizers are enabled. (i.e. places where we generate branches to unreachable).
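
To sketch what -trap-unreachable buys (illustrative only; the exact lowering is target-dependent, e.g. ud2 or int3 on x86):

/* Without the flag, the impossible default path below may fall off the
 * end of g(); with -mllvm -trap-unreachable it ends in a trap
 * instruction instead, so a violated assumption faults here rather
 * than in the next function in the image. */
int g(int x) {
    switch (x) {
    case 0: return -1;
    case 1: return 1;
    }
    __builtin_unreachable();
}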

If the issue is with the actual fallthrough - that's been discussed elsewhere recently, and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

As I understand things, folks are confused about the runtime behavior of falling through and magically calling some other function. If we can warn about that situation from the FE without false positives, then huttah, I think that's sufficient. But I don't think we know of a way to do that from the FE, so I think ensuring a runtime trap is more helpful to folks than the status quo. (Also, in general, I think we want to avoid emitting diagnostics from the backend -- that has usability issues where clean -fsyntax-only builds suddenly get warnings when doing full codegen, or in circumstances where the optimization level changes the diagnostic output, etc.)

The issue is that there are problems that may occur as a result of optimizations; if the front end does not do optimizations, it cannot diagnose these. Some examples include:

  • -Wframe-larger-than= (useful for embedded systems without "nearly unlimited" stack, though an imperfect proxy for maximal stack depth, which is what these systems actually want. Helpful for spotting large stack allocations that should come from the heap or static memory.)
  • -Wattribute-warning (some implementations of _FORTIFY_SOURCE, such as the Linux kernel's, rely on this; specifically, on calls to functions with this attribute being optimized out. This provides significant compile-time protection when optimizations expose an out-of-bounds access, compared to implementations that simply use _Static_assert.)
  • -Wfunction-fallthrough (the hypothetical new diagnostic this patch is adding)

Even if Clang had a high level IR and did optimizations, it still could not guarantee the results of LLVM's subsequent optimizations.

Does that mean that -Wframe-larger-than= (and the like) should not exist because it's not diagnosable via -fsyntax-only?

I do think there is a class of diagnostics, useful to users, that depends on optimizations being run. While a policy that new diagnostics must be diagnosable via -fsyntax-only is a simplistic litmus test, it leaves our competition free to implement diagnostics that may be more useful to users. That hamstrings our competitiveness, IMO.
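
Regarding the -Wattribute-warning bullet above, here is a hedged sketch of the kernel-style _FORTIFY_SOURCE pattern (the names __fortify_report and copy_checked are hypothetical, not the kernel's): the warning fires only if the call survives optimization, which is exactly the optimizer-dependence being debated.

extern void __fortify_report(void)
    __attribute__((warning("possible out-of-bounds copy")));

/* If the optimizer proves n is a compile-time constant exceeding
 * dst_sz, the call to __fortify_report() is emitted and
 * -Wattribute-warning fires; otherwise the branch folds away and no
 * diagnostic is produced. */
static inline void copy_checked(char *dst, unsigned long dst_sz,
                                const char *src, unsigned long n) {
    if (__builtin_constant_p(n) && n > dst_sz)
        __fortify_report();
    __builtin_memcpy(dst, src, n);
}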

I'm concerned about the potential for false positives:

  • If a user doesn't mark a function noreturn, but it doesn't actually ever return.

Ok, but a user can fix that in their code. Also, I think we could possibly help by adding diagnostics for this. Unconditionally calling a noreturn function or having an obviously infinite loop should trigger such a diagnostic, with a note encouraging the user to mark their function noreturn as well?

But that's a false positive - there was no bug in their code. We try to avoid those in compile-time diagnostics. Maybe the noreturn function is in a library they're calling that's outside their control.

  • A function is conditionally broken, but the condition is never actually true (for example, jump threading creates a dead codepath).
  • A function might actually be unconditionally broken, but it doesn't matter because it never actually gets called. This can easily happen with C++ templates, or an "outlining" optimization on LLVM IR.

These sound like bugs in LLVM. Even if we don't expose function fallthrough to users as a diagnostic warning (via -Wfunction-fallthrough), perhaps a -mllvm flag is still handy for llvm developers to spot places where we could perhaps improve codegen.

I don't understand why these would be bugs in LLVM - the LLVM source code itself has many functions that are intentionally unconditionally broken if you ever call them (they have just llvm_unreachable as their body) - and they are, in practice, never called. These aren't bugs - there isn't anything to "fix" in this code - and a warning/error that flagged them as such would be noise and an inconvenience to LLVM developers.

A sanitizer error if they're ever called would be suitable, and does exist - if a developer introduced a bug where one of these got called, it could be diagnosed by -fsanitize=undefined.

I don't think we should ever emit a branch to an unreachable block; perhaps we need to be more aggressive about cleaning those up? Otherwise it seems like a waste of instructions, which pollute I$.

For some of these situations, there are somewhat straightforward workarounds we can suggest for users... but in some cases, there aren't. I really don't want to implement warnings where the best suggestion we can make to fix the warning is "Try refactoring your code; it might go away".

Contrast that with the intent of the feature request: diagnosing when a function may fall through to another. That is quite painful to debug when it occurs at runtime. How do we reconcile?

Generally the solution LLVM/Clang prefers is compile-time diagnostics that can be implemented reliably/deterministically in the frontend, and sanitizers that fail fast at runtime for ~everything else.

Yeah, we already have the return type warning

Which warning?

-Wreturn-type: https://godbolt.org/z/cde5MGeac

and UBSan for the dynamic version of this check - seems OK?

Which ubsan check?

https://godbolt.org/z/n1YGWc9Wb - "runtime error: execution reached the end of a value-returning function without returning a value"
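
The case both of these cover looks roughly like this (a minimal sketch; the godbolt links show the real output):

/* -Wreturn-type warns at compile time that control can reach the end
 * of a value-returning function; UBSan's missing-return check reports
 * the same situation at runtime instead. */
int sign(int x) {
    if (x > 0)
        return 1;
    if (x < 0)
        return -1;
}   /* x == 0 falls off the end */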

That a function lowers to unreachable isn't a bug - there's a ton of them in LLVM, quite intentionally as unimplemented functions that should never be called.

IDK about an entire function lowering to unreachable; it's more that there exists a branch to an unreachable basic block (or simply an unreachable inst) that results in one function falling through to another.

If the issue is with the actual fallthrough - that's been discussed elsewhere recently,

Link?

https://discourse.llvm.org/t/can-we-keep-must-progress-optimizations-around-infinite-loop-in-c-while-avoiding-some-surprising-behavior/69205/3

and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

Right, at this point, I'm probably just going to turn on -mllvm -trap-unreachable in the Linux kernel. I might create a -f flag for it, since I do my best to avoid using -mllvm in the kernel's build system. I think this is unfortunate because I suspect it will increase the kernel image size and hide codegen bugs in LLVM, particularly when sanitizers are enabled. (i.e. places where we generate branches to unreachable).

What sort of codegen bugs would this hide? What's the problematic case with sanitizers you're picturing? I don't quite understand it - perhaps a godbolt link/example?

If this does increase size at all beyond the necessary cases (adding an int3 on an empty and never-called function, or a never-taken branch that would've fallen through), those are bugs we could fix.

That's partly what I wanted to do when I started looking at enabling TrapOnUnreachable in general - trying to get to the minimal number of traps necessary, so code with an infinite-but-side-effecting (so valid C++/IR) loop shouldn't need a trap after it, and nothing branches to the unreachable so it's fine to omit any trap there. But I never could figure out how to do that nicely (basically: is the last instruction before the end, where the trap would go, an unconditional jump? Then there's no need for a trap, probably... (I mean, someone else might be jumping into this tail, so it's a bit trickier than just that, and even just that wasn't obvious to me))

If the issue is with the actual fallthrough - that's been discussed elsewhere recently, and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

As I understand things, folks are confused about the runtime behavior of falling through and magically calling some other function. If we can warn about that situation from the FE without false positives, then huttah, I think that's sufficient. But I don't think we know of a way to do that from the FE, so I think ensuring a runtime trap is more helpful to folks than the status quo. (Also, in general, I think we want to avoid emitting diagnostics from the backend -- that has usability issues where clean -fsyntax-only builds suddenly get warnings when doing full codegen, or in circumstances where the optimization level changes the diagnostic output, etc.)

The issue is that there are problems that may occur as a result of optimizations; if the front end does not do optimizations, it cannot diagnose these. Some examples include:

  • -Wframe-larger-than= (useful for embedded systems without "nearly unlimited" stack, though an imperfect proxy for maximal stack depth, which is what these systems actually want. Helpful for spotting large stack allocations that should come from the heap or static memory.)
  • -Wattribute-warning (some implementations of _FORTIFY_SOURCE, such as the Linux kernel's, rely on this; specifically, on calls to functions with this attribute being optimized out. This provides significant compile-time protection when optimizations expose an out-of-bounds access, compared to implementations that simply use _Static_assert.)
  • -Wfunction-fallthrough (the hypothetical new diagnostic this patch is adding)

Even if Clang had a high level IR and did optimizations, it still could not guarantee the results of LLVM's subsequent optimizations.

Does that mean that -Wframe-larger-than= (and the like) should not exist because it's not diagnosable via -fsyntax-only?

I do think there is a class of diagnostics, useful to users, that depends on optimizations being run. While a policy that new diagnostics must be diagnosable via -fsyntax-only is a simplistic litmus test, it leaves our competition free to implement diagnostics that may be more useful to users. That hamstrings our competitiveness, IMO.

Different tools for different folks - I don't think it's outright unacceptable to have optimizer-driven diagnostics, but I think we're understandably cautious of them.

I wouldn't totally mind if frame-larger-than were a sanitizer failure only if/when the function is called. Doesn't seem so bad. But it's not the worst thing as it is today either.

I'm concerned about the potential for false positives:

  • If a user doesn't mark a function noreturn, but it doesn't actually ever return.

Ok, but a user can fix that in their code. Also, I think we could possibly help by adding diagnostics for this. Unconditionally calling a noreturn function or having an obviously infinite loop should trigger such a diagnostic, with a note encouraging the user to mark their function noreturn as well?

But that's a false positive - there was no bug in their code. We try to avoid those in compile-time diagnostics. Maybe the noreturn function is in a library they're calling that's outside their control.

Can you or @efriedma demonstrate a concrete case you're thinking of? I've tried:

void x (void) {
    for (;;);
}
int y (int x) {
    if (x)
        __builtin_unreachable();
    return 42;
}
__attribute__((noreturn))
extern void z (void);

void w (void) { z(); }

with this patch applied, and none of the above produces such a false positive at any optimization level (i.e. no -Wfunction-fallthrough is produced; you'll notice I explicitly check for calls to noreturn functions). So I'd like to make sure we're not arguing against a strawman here. I'm happy to add more test cases to this patch to assure reviewers there are no false positives.

  • A function is conditionally broken, but the condition is never actually true (for example, jump threading creates a dead codepath).
  • A function might actually be unconditionally broken, but it doesn't matter because it never actually gets called. This can easily happen with C++ templates, or an "outlining" optimization on LLVM IR.

These sound like bugs in LLVM. Even if we don't expose function fallthrough to users as a diagnostic warning (via -Wfunction-fallthrough), perhaps a -mllvm flag is still handy for llvm developers to spot places where we could perhaps improve codegen.

I don't understand why these would be bugs in LLVM - the LLVM source code itself has many functions that are intentionally unconditionally broken if you ever call them (they have just llvm_unreachable as their body) - and they are, in practice, never called. These aren't bugs - there isn't anything to "fix" in this code - and a warning/error that flagged them as such would be noise and an inconvenience to LLVM developers.

I'm not sure why you're referring to calls. This diagnostic is strictly about intra-function control flow that results in a branch to unreachable, which is generally lowered to a branch off the end of the function. I would think that would be considered unacceptable.

A sanitizer error if they're ever called would be suitable, and does exist - if a developer introduced a bug where one of these got called, it could be diagnosed by -fsanitize=undefined.
Generally the solution LLVM/Clang prefers is compile-time diagnostics that can be implemented reliably/deterministically in the frontend, and sanitizers that fail fast at runtime for ~everything else.

Yes, but kernel developers would prefer to diagnose issues at compile time when possible, rather than at runtime. Function fallthrough is diagnosable at compile time, as this patch demonstrates.

Yeah, we already have the return type warning

Which warning?

-Wreturn-type: https://godbolt.org/z/cde5MGeac

Strange; we get different behavior between C and C++ here in clang with respect to whether a ret is inserted or not. But for C, I think it demonstrates that the diagnostic is not enough to catch function fallthrough; try adding __builtin_unreachable() and notice how -Wreturn-type is of no help.
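
Concretely, a small sketch of that point:

/* Adding __builtin_unreachable() silences -Wreturn-type, but if the
 * "x is never 0" assumption is ever wrong, the function still falls
 * through at runtime; only something like -Wfunction-fallthrough has
 * anything to say here. */
int sign(int x) {
    if (x > 0)
        return 1;
    if (x < 0)
        return -1;
    __builtin_unreachable(); /* asserts x != 0 */
}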

and UBSan for the dynamic version of this check - seems OK?

Which ubsan check?

https://godbolt.org/z/n1YGWc9Wb - "runtime error: execution reached the end of a value-returning function without returning a value"

(Strange again; we get different behavior here between C and C++ in clang)

That said, it would be nice to diagnose function fallthrough at compile time rather than having to have a distinct UBSan build catch this at runtime.

That a function lowers to unreachable isn't a bug - there's a ton of them in LLVM, quite intentionally as unimplemented functions that should never be called.

IDK about an entire function lowering to unreachable; it's more that there exists a branch to an unreachable basic block (or simply an unreachable inst) that results in one function falling through to another.

If the issue is with the actual fallthrough - that's been discussed elsewhere recently,

Link?

https://discourse.llvm.org/t/can-we-keep-must-progress-optimizations-around-infinite-loop-in-c-while-avoiding-some-surprising-behavior/69205/3

Yes, this seems similar. Thanks for the link; I'll watch that thread and think about whether there's a common solution to both problems here.

and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

Right, at this point, I'm probably just going to turn on -mllvm -trap-unreachable in the Linux kernel. I might create a -f flag for it, since I do my best to avoid using -mllvm in the kernel's build system. I think this is unfortunate because I suspect it will increase the kernel image size and hide codegen bugs in LLVM, particularly when sanitizers are enabled. (i.e. places where we generate branches to unreachable).

What sort of codegen bugs would this hide? What's the problematic case with sanitizers you're picturing? I don't quite understand it - perhaps a godbolt link/example?

Here's a reduced test case from a recent kernel incident:
https://godbolt.org/z/q8TWrjMf6
Note how __ubsan_handle_out_of_bounds is literally marked RECOVERABLE in https://github.com/llvm/llvm-project/blob/fb8d894f23c5e805f0c87d89fb9d6c0eed3a0e72/compiler-rt/lib/ubsan/ubsan_handlers.h#L90-L91, and yet we generate a call to it that will fall off the end of the function when that callback returns.

Note: that test case has an off-by-one bug in the do-while loop that was since fixed; but it's additionally problematic that LLVM generated code that was not "recoverable."
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=efbcbb12ee99f750c9f25c873b55ad774871de2a
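
The shape of the bug was roughly this (a hedged reconstruction, not the kernel's actual code):

/* Built with -fsanitize=array-bounds: the final iteration reads a[4],
 * the "recoverable" __ubsan_handle_out_of_bounds callback fires, and
 * because the post-UB path was folded to unreachable, control can fall
 * off the end of the function when the callback returns. */
int sum(void) {
    static const int a[4] = {1, 2, 3, 4};
    int i = 0, s = 0;
    do {
        s += a[i];
    } while (i++ < 4); /* off-by-one: should be i++ < 3 */
    return s;
}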

If this does increase size at all beyond the necessary cases (adding an int3 on an empty and never-called function, or a never-taken branch that would've fallen through), those are bugs we could fix.

Quick stats for a kernel build (x86 defconfig):
Default:
85635360 B == ~81.66 MiB
Default + -mllvm -trap-unreachable:
85826000 B == ~81.85 MiB
(so a ~0.22% increase in binary size)

That's partly what I wanted to do when I started looking at enabling TrapOnUnreachable in general - trying to get to the minimal number of traps necessary, so code with an infinite-but-side-effecting (so valid C++/IR) loop shouldn't need a trap after it, and nothing branches to the unreachable so it's fine to omit any trap there. But I never could figure out how to do that nicely (basically: is the last instruction before the end, where the trap would go, an unconditional jump? Then there's no need for a trap, probably... (I mean, someone else might be jumping into this tail, so it's a bit trickier than just that, and even just that wasn't obvious to me))

As you note, it is interesting to see TargetOptions::TrapUnreachable enabled by default for specific targets. It is probably worth pursuing enabling it for more targets. Were there patches to that effect?

If the issue is with the actual fallthrough - that's been discussed elsewhere recently, and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

As I understand things, folks are confused about the runtime behavior of falling through and magically calling some other function. If we can warn about that situation from the FE without false positives, then huttah, I think that's sufficient. But I don't think we know of a way to do that from the FE, so I think ensuring a runtime trap is more helpful to folks than the status quo. (Also, in general, I think we want to avoid emitting diagnostics from the backend -- that has usability issues where clean -fsyntax-only builds suddenly get warnings when doing full codegen, or in circumstances where the optimization level changes the diagnostic output, etc.)

The issue is that there are problems that may occur as a result of optimizations; if the front end does not do optimizations, it cannot diagnose these. Some examples include:

  • -Wframe-larger-than= (useful for embedded systems without "nearly unlimited" stack, though an imperfect proxy for maximal stack depth, which is what these systems actually want. Helpful for spotting large stack allocations that should come from the heap or static memory.)
  • -Wattribute-warning (some implementations of _FORTIFY_SOURCE, such as the Linux kernel's, rely on this; specifically, on calls to functions with this attribute being optimized out. This provides significant compile-time protection when optimizations expose an out-of-bounds access, compared to implementations that simply use _Static_assert.)
  • -Wfunction-fallthrough (the hypothetical new diagnostic this patch is adding)

Even if Clang had a high level IR and did optimizations, it still could not guarantee the results of LLVM's subsequent optimizations.

Does that mean that -Wframe-larger-than= (and the like) should not exist because it's not diagnosable via -fsyntax-only?

I do think there is a class of diagnostics, useful to users, that depends on optimizations being run. While a policy that new diagnostics must be diagnosable via -fsyntax-only is a simplistic litmus test, it leaves our competition free to implement diagnostics that may be more useful to users. That hamstrings our competitiveness, IMO.

Different tools for different folks - I don't think it's outright unacceptable to have optimizer-driven diagnostics, but I think we're understandably cautious of them.

I wouldn't totally mind if frame-larger-than were a sanitizer failure only if/when the function is called. Doesn't seem so bad. But it's not the worst thing as it is today either.

Honestly, I should probably write a new diagnostic for a single object being stack allocated if it's greater than a given threshold; that's probably more useful for what the kernel is looking for; otherwise -Wframe-larger-than= can be death by a thousand papercuts due to inlining decisions.

Making -Wframe-larger-than into a sanitizer is perhaps interesting; it would probably require careful interaction with the runtime to know what the "current stack depth" was at all times. Though, "we prefer to diagnose things at compile time, when possible." @kees might know if the kernel has any protection to detect a potential stack overflow at runtime, or what the requirements for some such thing might look like.

Yes, but kernel developers would prefer to diagnose issues at compile time when possible, rather than at runtime. Function fallthrough is diagnosable at compile time, as this patch demonstrates.

I'm not saying the diagnostic is useless. I'm saying users are going to find cases where their perfectly good code blows up, and so the warning will get disabled for every large codebase except the Linux kernel. And every user that goes through the process of seeing the false positive and disabling the warning will start distrusting other warnings that are actually reliable.

Note how __ubsan_handle_out_of_bounds is literally marked RECOVERABLE in https://github.com/llvm/llvm-project/blob/fb8d894f23c5e805f0c87d89fb9d6c0eed3a0e72/compiler-rt/lib/ubsan/ubsan_handlers.h#L90-L91, and yet we generate a call to it that will fall off the end of the function when that callback returns.

Using "recoverable" UBSan doesn't actually affect the code we generate after the diagnostic, and it doesn't suppress compiler optimizations that turn provably undefined code into "unreachable". So often the code blows up anyway. Maybe we should try to do something different here in some cases. For out-of-bounds specifically, I'm not sure what a fix for that would look like, though; there isn't any intuitive behavior for an out-of-bounds access.

Can you or @efriedma demonstrate a concrete case you're thinking of? I've tried:

The cases I'm thinking of are cases which *do* actually appear to fall through, but don't because of invariants that aren't visible to the compiler. Functions that look like void f() { __builtin_unreachable(); } actually exist in LLVM. (The pattern tends to show up with virtual functions...) Or a function may or may not be noreturn depending on the argument values. Or jump threading generated a codepath that has unconditional undefined behavior, but we can't actually prove it because there's a call to a function that isn't marked willreturn.

Have you tried building LLVM with this to see what shows up?
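
For example, the placeholder shape being described might look like this (a hypothetical sketch):

/* Intentionally "broken" if called; by project invariant it never is.
 * This is the C analogue of an LLVM method whose whole body is
 * llvm_unreachable(), and a fallthrough warning here would be noise. */
void unimplemented_hook(void) {
    __builtin_unreachable();
}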

ojeda added a subscriber: ojeda. · Mar 23 2023, 2:37 AM

If the issue is with the actual fallthrough - that's been discussed elsewhere recently, and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.

As I understand things, folks are confused about the runtime behavior of falling through and magically calling some other function. If we can warn about that situation from the FE without false positives, then huttah, I think that's sufficient. But I don't think we know of a way to do that from the FE, so I think ensuring a runtime trap is more helpful to folks than the status quo. (Also, in general, I think we want to avoid emitting diagnostics from the backend -- that has usability issues where clean -fsyntax-only builds suddenly get warnings when doing full codegen, or in circumstances where the optimization level changes the diagnostic output, etc.)

The issue is that there are problems that may occur as a result of optimizations; if the front end does not do optimizations, it cannot diagnose these. Some examples include:

  • -Wframe-larger-than= (useful for embedded systems without "nearly unlimited" stack, though an imperfect proxy for maximal stack depth, which is what these systems actually want. Helpful for spotting large stack allocations that should come from the heap or static memory.)
  • -Wattribute-warning (some implementations of _FORTIFY_SOURCE, such as the Linux kernel's, rely on this; specifically, on calls to functions with this attribute being optimized out. This provides significant compile-time protection when optimizations expose an out-of-bounds access, compared to implementations that simply use _Static_assert.)
  • -Wfunction-fallthrough (the hypothetical new diagnostic this patch is adding)

Even if Clang had a high level IR and did optimizations, it still could not guarantee the results of LLVM's subsequent optimizations.

Does that mean that -Wframe-larger-than= (and the like) should not exist because it's not diagnosable via -fsyntax-only?

I do think there is a class of diagnostics, useful to users, that depends on optimizations being run. While a policy that new diagnostics must be diagnosable via -fsyntax-only is a simplistic litmus test, it leaves our competition free to implement diagnostics that may be more useful to users. That hamstrings our competitiveness, IMO.

Okay, I was not very thorough in explaining my concerns about diagnostics coming out of the optimizer. -fsyntax-only behavior is one facet of the issue, but the problem boils down to "mysterious compiler behavior" as far as the user is concerned. Other examples of this same problem are:

  • User is compiling everything locally with their debug build (-O0) and everything compiles fine. They commit it and it gets built as a release build (-O2 or -O3), and now they get alerted to problems.
  • User is compiling everything locally with their optimized build and everything compiles fine. They commit it and it gets built on another machine with significantly more resources than the developer's machine had, so optimization decisions are different and now they get alerted to problems.
  • User is compiling everything locally with their optimized build and everything compiles fine, they close their web browser, fix a typo in a comment, and recompile and now they get alerted to problems.
  • Any use of -Werror with optimization-based diagnostics is effectively playing compiler Russian roulette.

The root of the problem is that we want diagnostics to be reproducible, and diagnostics generated by the optimizer often are not reliably reproducible. GCC has already gone down this path, and it causes confusion in practice:

https://stackoverflow.com/questions/59388740/gcc-shows-different-warnings-depending-on-optimisation-level
https://stackoverflow.com/questions/65049644/gcc-warning-about-unitialized-values-caused-by-turning-optimization-on
https://developers.redhat.com/blog/2019/03/13/understanding-gcc-warnings-part-2#

That's why we've traditionally limited the backend to producing remarks instead of warnings or errors. It's a user-experience tradeoff. That's not to say there's never a valid reason to diagnose from the backend, but I think we need to be very mindful about the user experience when we add them. Linux kernel developers are one class of user (and likely don't have too much confusion over optimization-based diagnostics), but we've got plenty of users for whom the compiler is effectively a black box, and those are the folks I worry about with optimization-based diagnostics.

Note how __ubsan_handle_out_of_bounds is literally marked RECOVERABLE in https://github.com/llvm/llvm-project/blob/fb8d894f23c5e805f0c87d89fb9d6c0eed3a0e72/compiler-rt/lib/ubsan/ubsan_handlers.h#L90-L91, and yet we generate a call to it that will fall off the end of the function when that callback returns.

Using "recoverable" UBSan doesn't actually affect the code we generate after the diagnostic, and it doesn't suppress compiler optimizations that turn provably undefined code into "unreachable". So often the code blows up anyway. Maybe we should try to do something different here in some cases. For out-of-bounds specifically, I'm not sure what a fix for that would look like, though; there isn't any intuitive behavior for an out-of-bounds access.

Recoverable is an important concept for ubsan callbacks; after some, the program is expected to continue, and after others it is not. If we enable UBSan to help us find UB in our program, but the presence of said UB causes our program to go off the rails, then that would not be considered recoverable. A less charitable interpretation would be to call that sanitizer broken.

I wonder if the issue I'm seeing in https://godbolt.org/z/aaj6KTrPe is actually something better solved by additional use of freeze on poison? (uh oh, I've summoned Cthulhu (and the Cenobites) by uttering those two terms in the same sentence! A blood sacrifice is necessary!)

For example, where does the unreachable come from in that example?

SCCPPass is turning a br i1 poison, ... into an unreachable.

Where did the branch on poison come from?

LoopFullUnrollPass introduced a bunch of branches on poison when unrolling the loop.

I'd bet that if we "froze" the poison first before branching on it (and only did so when -fsanitize=array-bounds was enabled) then we'd never end up replacing the branch with unreachable, which is destroying our control flow and making our sanitizer not actually recoverable.


Come to think of it, the other case of "broken code" I've seen in the past (and alluded to upthread) was related to the divide-by-zero sanitizer.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc3&id=e5d523f1ae8f2cef01f8e071aeee432654166708
I should reduce a test case from that, and see if perhaps freeze is a solution there, too. Then maybe warning on fallthrough becomes moot (or less interesting) if these are simply codegen bugs in the various UBSanitizers.

There are limits to how much we can do by just changing the code clang generates... but the two particular cases you mention probably could be "fixed" by messing with the IR generated by clang. Sure, that probably makes sense to pursue. (If you're going to pick an arbitrary value, zero is probably a better choice than "freeze poison".)

There are limits to how much we can do by just changing the code clang generates... but the two particular cases you mention probably could be "fixed" by messing with the IR generated by clang. Sure, that probably makes sense to pursue. (If you're going to pick an arbitrary value, zero is probably a better choice than "freeze poison".)

I don't think that could be done in clang's codegen of the IR. What I had in mind is closer to:

diff --git a/llvm/lib/Transforms/Utils/LoopUnroll.cpp b/llvm/lib/Transforms/Utils/LoopUnroll.cpp
index 916c02692823..aa1f288b51ee 100644
--- a/llvm/lib/Transforms/Utils/LoopUnroll.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUnroll.cpp
@@ -236,9 +236,17 @@ void llvm::simplifyLoopAfterUnroll(Loop *L, bool SimplifyIVs, LoopInfo *LI,
   SmallVector<WeakTrackingVH, 16> DeadInsts;
   for (BasicBlock *BB : L->getBlocks()) {
     for (Instruction &Inst : llvm::make_early_inc_range(*BB)) {
-      if (Value *V = simplifyInstruction(&Inst, {DL, nullptr, DT, AC}))
-        if (LI->replacementPreservesLCSSAForm(&Inst, V))
+      if (Value *V = simplifyInstruction(&Inst, {DL, nullptr, DT, AC})) {
+        // TODO: check if sanitizer is enabled
+        if (isa<PoisonValue>(V) && isa<LoadInst>(Inst)) {
+          errs() << "XXX: found poison (replacing a load)!\n";
+          FreezeInst *FI = new FreezeInst(V);
+          FI->insertInto(BB, Inst.getIterator());
+          if (LI->replacementPreservesLCSSAForm(&Inst, FI))
+            Inst.replaceAllUsesWith(FI);
+        } else if (LI->replacementPreservesLCSSAForm(&Inst, V))
           Inst.replaceAllUsesWith(V);
+      }
       if (isInstructionTriviallyDead(&Inst))
         DeadInsts.emplace_back(&Inst);
     }

Which seems to do exactly what I want (minus only doing so when -fsanitize=array-bounds is enabled; not sure we leave such breadcrumbs behind in IR; still looking to see if we can tell in LLVM when specific sanitizers are enabled)

The problem with a change like that is that it's not clear what the underlying semantic model is. If we add a flag that says "try to generate code that matches Linux kernel developers' mental model of the underlying machine", or "loop unrolling should try to preserve undefined behavior", or "out-of-bounds memory accesses are well-defined", the semantics are unclear. The point of having well-defined IR is that a developer can tell whether a transform is legal or not based on reading LangRef, as opposed to trying to blindly guess whether a transform is going to break some user's code.

Making clang generate different IR for specific load/division/etc. operators is much easier to specify; IR has the same meaning it always has, we're just generating IR that's well-defined in more cases.

The problem with a change like that is that it's not clear what the underlying semantic model is. If we add a flag that says "try to generate code that matches Linux kernel developers' mental model of the underlying machine", or "loop unrolling should try to preserve undefined behavior", or "out-of-bounds memory accesses are well-defined", the semantics are unclear. The point of having well-defined IR is that a developer can tell whether a transform is legal or not based on reading LangRef, as opposed to trying to blindly guess whether a transform is going to break some user's code.

I think it needs to make specific cases of UB defined when the specific sanitizers trying to catch those UB cases are enabled, if those sanitizer callbacks are expected to be recoverable (i.e. the ones that are not noreturn). (This reminds me of a transform I think LLVM does where the blockaddress of a removed basic block gets replaced with 1.)

Making clang generate different IR for specific load/division/etc. operators is much easier to specify; IR has the same meaning it always has, we're just generating IR that's well-defined in more cases.

So perhaps if -fsanitize=array-bounds was enabled, clang would add metadata to each load to say "this is an array bounds sanitizer load" (and metadata for udiv when -fsanitize=integer-divide-by-zero)? I think then we'd still need to teach simplifyInstruction to look for such metadata and do something different (like replace with freeze or 0 or 1 or whatever).

Or did you have something else in mind?

(Is it worth moving this discussion to another RFC? I would like to ask folks from https://users.cs.utah.edu/~regehr/papers/undef-pldi17.pdf their thoughts on freeze wrt this issue.)

nikic added a subscriber: nikic. · Mar 23 2023, 1:14 PM

@nickdesaulniers I don't think we want to handle that on the LLVM side. That will be fundamentally unreliable. If Clang wishes to make "recoverable" ubsan have arbitrary but well-defined behavior in the case of UB, it needs to generate appropriate IR to model that. For example, instead of generating

if (x == 0) {
  report_div_by_zero();
}
res = y / x;

it needs to generate

if (x == 0) {
  report_div_by_zero();
  res = 0; // or any other value
} else {
  res = y / x;
}

And similarly for other cases.
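
The analogous shape for -fsanitize=array-bounds might be (a sketch in the same style; the recovery value is arbitrary):

if (i >= n) {
  report_out_of_bounds();
  res = 0; // arbitrary but well-defined recovery value
} else {
  res = a[i];
}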

(recoverable feels like a bit of a distraction here? recoverable just means you've asked ubsan not to trap/stop on failure - but to let the program continue and do whatever it would've done without the sanitizer enabled - sometimes that's crash/trap anyway, sometimes it's something less bad... but that's all that's being asked for: "keep going/do whatever you'd do without the sanitizer enabled, rather than hard stop as soon as the sanitizer detects a problem" - no, we shouldn't recover differently/more safely with sanitizers enabled (don't want to create a language variant/encourage people to build incorrect programs with sanitizers and run them that way because "it works"))

but we could/should trap at the end of a function that lacks a valid return, if the end is reachable by local reasoning (ie: maybe we still trap after noreturn calls - alternatively, we could guarantee a trap at the end of every noreturn function so it can't return, instead of trapping after noreturn calls? (wouldn't be 100% rigorous, because you could be mix-and-matching compilers, but might be the right tradeoff in terms of size/safety))

I think the reason "recoverable" ubsan causes trouble is that it introduces branches that subsequent optimizations can abuse. So without ubsan, we just have a udiv instruction. With ubsan, we conveniently have a branch on exactly the condition that would make the udiv undefined, so we can easily prove control flow doesn't continue after the ubsan handler. Subsequent optimizations take advantage of that, so ubsan "breaks" code. (So the code was never actually correct according to the semantic model, but it was broken in a way the compiler is less likely to optimize.)

I think the reason "recoverable" ubsan causes trouble is that it introduces branches that subsequent optimizations can abuse. So without ubsan, we just have a udiv instruction. With ubsan, we conveniently have a branch on exactly the condition that would make the udiv undefined, so we can easily prove control flow doesn't continue after the ubsan handler. Subsequent optimizations take advantage of that, so ubsan "breaks" code. (So the code was never actually correct according to the semantic model, but it was broken in a way the compiler is less likely to optimize.)

Ah, interesting - thanks for the explanation!

(but, yeah, not sure we can/should provide any further guarantees beyond "does /something/ after the sanitizer failure" - you don't get stronger guarantees than without the sanitizer - though it's worth documenting that the sanitizer could make the problems after "recovery" worse than without the sanitizer enabled)