This is an archive of the discontinued LLVM Phabricator instance.

[Sanitizers] UBSan unreachable incompatible with ASan in the presence of `noreturn` calls
AbandonedPublic

Authored by yln on Jan 11 2019, 5:02 PM.

Details

Summary

UBSan wants to detect when unreachable code is actually reached, so it adds instrumentation before every unreachable instruction. However, the optimizer will remove code after calls to functions marked with noreturn. To avoid this UBSan removes noreturn from both the call instruction as well as from the function itself. Unfortunately, ASan relies on this annotation to unpoison the stack by inserting calls to _asan_handle_no_return before noreturn functions. This is important for functions that do not return but access the the stack memory, e.g., unwinder functions *like* longjmp (longjmp itself is actually "double-proofed" via its interceptor). The result is that when ASan and UBSan are combined, the noreturn attributes are missing and ASan cannot unpoison the stack, so it has false positives when stack unwinding is used.

Changes:

  1. UBSan now adds the expect_noreturn attribute whenever it removes the noreturn attribute from a function
  2. ASan additionally checks for the presence of this attribute

Generated code:

call void @__asan_handle_no_return    // Additionally inserted to avoid false positives
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable
unreachable

Is the second call to __asan_handle_no_return redundant?
If we care about this, then I can provide a follow-up patch dealing with this.

rdar://problem/40723397

Diff Detail

Event Timeline

yln created this revision.Jan 11 2019, 5:02 PM

@yln A few comments on the patch description.

ASan relies on this annotation to unpoisen the stack

s/unpoisen/unpoison

  1. (unrelated change) ASan now also only instruments calls that are not marked nosanitize

I'm confused. If this change is unrelated to problem we are trying to fix then why is it in this patch?

delcypher added inline comments.Jan 17 2019, 5:15 AM
clang/lib/CodeGen/CGCall.cpp
4398

I feel like this comment needs more explanation. Where is hasFnAttr called here? It's not at all obvious to me.

4404

This really deserves a comment. Maybe something like

// Sanitizer instrumentation passes (e.g. ASan) still need to know that this function won't return.
clang/lib/CodeGen/SanitizerMetadata.cpp
95 ↗(On Diff #181394)

Perhaps add a comment in here describing the semantics of this attribute and explaining why its necessary (i.e. llvm::Attribute::NoReturn might get stripped but we need this information to produce correct instrumentation).

clang/test/CodeGenCXX/ubsan-unreachable.cpp
7

This check assumes that sanitizer_noreturn is the first metadata attribute. Is that a safe assumption to make?

llvm/test/Instrumentation/AddressSanitizer/instrument-no-return.ll
1

Why did you drop -asan-module here?

delcypher added inline comments.Jan 17 2019, 5:17 AM
compiler-rt/test/asan/TestCases/ubsan_noreturn_compatibility.c
3 ↗(On Diff #181394)

This test should have a REQUIRES: check that checks for UBSan support.

delcypher added a comment.EditedJan 17 2019, 5:20 AM

Is it possible to write a compiler-rt lit test with a custom noreturn function that tries to touch the stack such that with the old Clang we'd get a false positive report (when building with ASan+UBSan) when we try to execute it?

Metadata seems like a wrong tool for this job because it can be discarded at will.
How about an intrinsic, ex. call @llvm.sanitizer.noreturn() before a noreturn call, which asan can replace with __asan_handle_noreturn, and other sanitizers can either handle or drop as necessary?

yln added a comment.Jan 17 2019, 2:52 PM

Metadata seems like a wrong tool for this job because it can be discarded at will.
How about an intrinsic, ex. call @llvm.sanitizer.noreturn() before a noreturn call, which asan can replace with __asan_handle_noreturn, and other sanitizers can either handle or drop as necessary?

Good point. Another option would be to replace the noreturn with another attribute. Do you have a preference between those two options?

I'm leaning toward the intrinsic, mainly because I've no idea what the new attribute should be called, nor what it would even mean.
ASan uses noreturn attribute as a proxy for "may do non-local jump".

The intrinsic must be declared to have side effects to prevent memory ops from sneaking between it and the original function call.

Hmm, how about "expect_noreturn" for an attribute name?

yln added a comment.Jan 17 2019, 5:01 PM

How about sanitizer_noreturn or noreturn_for_sanitizer. It should convey the same meaning as noreturn but has a specific audience: sanitizers.

Quick question about intrinsics (I don't know very much about them yet):

asan can replace with __asan_handle_noreturn, and other sanitizers can either handle or drop as necessary

Can this be structured in a way so that we can define in one place that the default is to drop/ignore and override the default for ASan?
If yes, I am happy to go with the intrinsic.

Otherwise, I prefer the attribute because the sanitizers that don't care can simply ignore it.
Thanks for your feedback! :)

In D56624#1362458, @yln wrote:

How about sanitizer_noreturn or noreturn_for_sanitizer. It should convey the same meaning as noreturn but has a specific audience: sanitizers.

It's fine, I guess. But I'm starting to like "expect_noreturn" - it basically tells you that the code following the call is cold. That's exactly what happens with ubsan, and there may even be optimizations that could take advantage of it. It has meaning outside of sanitizers. Could also be "unlikely_return" or something like that.

Quick question about intrinsics (I don't know very much about them yet):

asan can replace with __asan_handle_noreturn, and other sanitizers can either handle or drop as necessary

Can this be structured in a way so that we can define in one place that the default is to drop/ignore and override the default for ASan?
If yes, I am happy to go with the intrinsic.

SelectionDAG could be made to ignore this intrinsic if it ever sees it.

yln added a comment.Jan 17 2019, 7:01 PM

I like your approach with the attribute with a separate distinctive meaning outside of the sanitizers! I will update the patch accordingly.
About the name: to me, "unlikely_return" sounds more natural and "expect_noreturn" reminds me of "__builtin_expect", which is not a bad thing.

yln updated this revision to Diff 182654.Jan 18 2019, 5:57 PM
yln edited the summary of this revision. (Show Details)

Use a function attribute instead of metadata to mark functions which are unlikely to return. Update summary description.

Remove the unrelated change/optimization of not instrumenting calls with !nosanitize. I will provide a separate patch, if we decide that we want this.

yln marked 6 inline comments as done.Jan 18 2019, 6:09 PM
yln added inline comments.
compiler-rt/test/asan/TestCases/ubsan_noreturn_compatibility.c
3 ↗(On Diff #181394)

Good point! How can I make this work? llvm-lit always marks the test all unsupported for me.
I tried it with ubsan, asan-ubsan, ubsan-standalone.

llvm/test/Instrumentation/AddressSanitizer/instrument-no-return.ll
1

ASan is split into a function and module pass. The module pass is not required for this test.

yln marked an inline comment as done.Jan 18 2019, 6:10 PM
yln updated this revision to Diff 183142.Jan 23 2019, 11:45 AM

Update language reference documentation

yln marked an inline comment as done.Jan 23 2019, 12:01 PM

@eugenis: We now use an attribute expect_noreturn instead of metadata. Does that look good to you?

delcypher added inline comments.Jan 23 2019, 12:39 PM
llvm/docs/LangRef.rst
1463

s/correctenss/correctness/

1464

Suggested wording:

This function attribute indicates that the function is unlikely to return normally, but that it still allowed to do so.
This is useful in cases where ``noreturn`` is too strong a guarantee.

This patch is missing tests for the new attribute in AsmParser / BitcodeReader / BitcodeWriter.

call void @__asan_handle_no_return    // Additionally inserted to avoid false positives
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable

Yes, the second one is redundant. Is that because the ubsan call has noreturn attribute? Does it also have !nosanitize, and would it help in this case? The clang test should probably check for the attribute(s) here:

// CHECK: call void @__ubsan_handle_builtin_unreachable

In general, this looks fine.

compiler-rt/test/asan/TestCases/ubsan_noreturn_compatibility.c
3 ↗(On Diff #181394)

Does not asan always include ubsan?

delcypher added inline comments.Jan 23 2019, 2:06 PM
compiler-rt/test/asan/TestCases/ubsan_noreturn_compatibility.c
3 ↗(On Diff #181394)

How can I make this work?

If you put the test in UBSan test suite and add REQUIRES: ubsan-asan it should work as we build multiple combinations of sanitizers in that test suite, including UBSan+ASan.

3 ↗(On Diff #181394)

Does not asan always include ubsan?

Not necessarily. I believe there is support for RTEMS in the ASan code and sanitizer_common but I don't see any evidence of obvious evidence of support in the UBSan code. I'm just grepping for RTEMS so I could be wrong.

Regardless of this. I think it's better to write to clearly requires support for both sanitizers given that it's pretty easy to do.

yln marked 6 inline comments as done.Jan 23 2019, 2:31 PM
yln added inline comments.
compiler-rt/test/asan/TestCases/ubsan_noreturn_compatibility.c
3 ↗(On Diff #181394)

I tested ninja clean && ninja check-asan which does *not* build the ubsan runtime, but this test still passes.

llvm/docs/LangRef.rst
1464

Much clearer. Thanks!

yln marked an inline comment as done.Jan 23 2019, 2:32 PM
yln marked 2 inline comments as done.Jan 23 2019, 3:13 PM
yln updated this revision to Diff 183200.Jan 23 2019, 3:22 PM

Refine wording in docs.
Move test to ubsan/TestCases and mark with REQUIRES: ubsan-asan.
Add test for BitcodeReader/Writer.

yln added a comment.Jan 23 2019, 3:52 PM

This patch is missing tests for the new attribute in AsmParser / BitcodeReader / BitcodeWriter.

Are the additions in llvm/test/Bitcode/attributes.ll sufficient? Please point me to the appropriate places, if additional tests are required.

call void @__asan_handle_no_return    // Additionally inserted to avoid false positives
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable

Yes, the second one is redundant. Is that because the ubsan call has noreturn attribute? Does it also have !nosanitize, and would it help in this case?

Yes the call to ubsan_unreachable is marked both noreturn and !nosanitize. However, I am not sure if we can simply ignore calls marked with !nosanitize. My assumption is that we still have to unpoison the stack so that __ubsan_handle_builtin_unreachable can unwind and print a stack trace without a false positive from ASan. Do you know if my assumption holds or is it unnecessary?
My first idea would be to insert calls to __asan_handle_no_return only once per basic block.

I will address this in a separate patch since doing this optimization is an orthogonal pre-existing issue.

That should not be necessary.
__asan_handle_noreturn is needed for functions that move SP without going through ASan epilogue, in order to maintain the requirement that stack below SP has clean shadow.
Ubsan-rt does nothing of the sort.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 23 2019, 5:07 PM
This revision was automatically updated to reflect the committed changes.

Please revert this.
First, this wasn't reviewed.
Second, the lists weren't subscribed.

yln reopened this revision.Jan 24 2019, 10:12 AM

Please revert this.
First, this wasn't reviewed.
Second, the lists weren't subscribed.

I apologize for this. It was not my intention to land the revision without formal acceptance.

commit-lists:
I prepared this patch via the monorepo and did not select a repository in Phabricator because the changes span multiple repos. This means that I have to manually ensure that the correct lists are subscribed in the Phabricator web interface, correct?

In D56624#1369626, @yln wrote:

Please revert this.
First, this wasn't reviewed.
Second, the lists weren't subscribed.

I apologize for this. It was not my intention to land the revision without formal acceptance.

commit-lists:
I prepared this patch via the monorepo and did not select a repository in Phabricator because the changes span multiple repos.

I guess rL would be a safe default to make.

This means that I have to manually ensure that the correct lists are subscribed in the Phabricator web interface, correct?

Looks like the rules to subscribe the lists based on a content of the patch are still not added, so i'd say yes.

That being said, the normal practice in such situations is to open a new review.

Also, i suspect this should be split up into at least two parts - the new LLVM IR expect_noreturn attribute, and the rest.

vsk added a subscriber: vsk.Jan 24 2019, 10:29 AM

What are the advantages of a generalized expect_noreturn attribute, vs. a narrower attribute or intrinsic? The expect_noreturn semantics do not provide strong guarantees, and are not really orthogonal from the pre-existing cold attribute. In particular, expect_noreturn doesn't even seem strong enough to allow ASan to unpoison its stack.

Apologies if discussion has shifted elsewhere -- I'd be happy to chime in on the new review.

yln added a comment.Jan 24 2019, 11:08 AM

@lebedev.ri
Thanks for the clarifications!
I will split this up into multiple patches once we settled on a design.

In D56624#1369635, @vsk wrote:

What are the advantages of a generalized expect_noreturn attribute, vs. a narrower attribute or intrinsic? The expect_noreturn semantics do not provide strong guarantees, and are not really orthogonal from the pre-existing cold attribute.

@eugenis Do you want to chime in here?
I think they convey different meanings even if their treatment by the optimizer is similar. The cold attribute says nothing about whether or not a function is expected to return.

In particular, expect_noreturn doesn't even seem strong enough to allow ASan to unpoison its stack.

I am not sure I understand this part. Can you elaborate?

For context:

``cold``
  This attribute indicates that this function is rarely called. When
  computing edge weights, basic blocks post-dominated by a cold
  function call are also considered to be cold; and, thus, given low
  weight.
``noreturn``
  This function attribute indicates that the function never returns
  normally. This produces undefined behavior at runtime if the
  function ever does dynamically return.
``expect_noreturn``
  This function attribute indicates that the function is unlikely to return
  normally, but that it is still allowed to do so. This is useful in cases
  for which ``noreturn`` is too strong a guarantee.
yln added a reviewer: vsk.Jan 24 2019, 11:09 AM
vsk added a comment.Jan 24 2019, 11:21 AM
In D56624#1369767, @yln wrote:
In D56624#1369635, @vsk wrote:

What are the advantages of a generalized expect_noreturn attribute, vs. a narrower attribute or intrinsic? The expect_noreturn semantics do not provide strong guarantees, and are not really orthogonal from the pre-existing cold attribute.

@eugenis Do you want to chime in here?
I think they convey different meanings even if their treatment by the optimizer is similar. The cold attribute says nothing about whether or not a function is expected to return.

That's my point: it doesn't need to, because it's orthogonal. It's just a hint that a call is cold and could be profitable to split/reorder. Features of llvm IR generally try to be orthogonal to reduce complexity in the optimizer.

In particular, expect_noreturn doesn't even seem strong enough to allow ASan to unpoison its stack.

I am not sure I understand this part. Can you elaborate?

Because "expect_noreturn" calls are allowed to return, the compiler must behave as they could. In particular, this means that unpoisoning the stack before expect_noreturn calls (given the current semantics) is premature.

Put another way, a frontend author may (understandably, but mistakenly!) attach expect_noreturn to calls which they expect to be cold. That would regress ASan coverage.

yln added a comment.Jan 24 2019, 12:39 PM

Note that all of this currently only matters when compiling with -fsanitize=unreachable. The following discussion is within the context of the current implementation: UBSan removes the noreturn so it can instrument unreachable without the added instrumentation being optimized away. Maybe we should take a step back and ask if that is the right approach at all?

In D56624#1369795, @vsk wrote:

Because "expect_noreturn" calls are allowed to return, the compiler must behave as they could. In particular, this means that unpoisoning the stack before expect_noreturn calls (given the current semantics) is premature.

Put another way, a frontend author may (understandably, but mistakenly!) attach expect_noreturn to calls which they expect to be cold.

I think about this differently. Yes, most noreturn functions are also cold, e.g., abort, but not necessarily, e.g., calls to longjmp do not necessarily have to be. Why would it be okay to attach expect_noreturn instead of cold? Why do we think that this is an easy-to-make mistake? Have people accidentally put noreturn on cold functions before?
Can we agree on the following?
"It is orthogonal on the language level, but seems to be redundant in terms of the optimizer. Since LLVM IR's main purpose it support the optimizer, this is a strong argument against the general purpose attribute."

That would regress ASan coverage.

You talk specifically about cases of misuses of the attribute, right?
In the context of the current issue with UBSan the possibility for false negative is not too much of a regression: it only occurs when UBSan is going to diagnose an "unreachable error" anyways.

So the main point is whether or not to use a "general purpose" attribute or a "narrow purpose" attribute/intrinsic. My understanding is that you list the following points as arguments against general purpose. Is my understanding accurate?

  1. Potential misuse can regress ASan coverage
  2. Complicates optimizer

Narrow purpose: No potential misuses, and optimizer can simply ignore it.

Initially I proposed a narrow purpose attribute, but while iterating on this revision changed it to be general purpose. @eugenis
Now, I have a slight preference for general purpose: I don't think 1. is a big issue (but then again, I have no experience here), and 2. it is always correct for the optimizer to continue ignoring the attribute (current status).
Actually, 2. also encompasses the potential upside: a more complicated optimizer that takes advantage of the attribute to do additional optimizations.

vsk added a comment.Jan 24 2019, 1:03 PM
In D56624#1369940, @yln wrote:

Note that all of this currently only matters when compiling with -fsanitize=unreachable. The following discussion is within the context of the current implementation: UBSan removes the noreturn so it can instrument unreachable without the added instrumentation being optimized away. Maybe we should take a step back and ask if that is the right approach at all?

In D56624#1369795, @vsk wrote:

Because "expect_noreturn" calls are allowed to return, the compiler must behave as they could. In particular, this means that unpoisoning the stack before expect_noreturn calls (given the current semantics) is premature.

Put another way, a frontend author may (understandably, but mistakenly!) attach expect_noreturn to calls which they expect to be cold.

I think about this differently. Yes, most noreturn functions are also cold, e.g., abort, but not necessarily, e.g., calls to longjmp do not necessarily have to be. Why would it be okay to attach expect_noreturn instead of cold?

It would be okay by definition, because it would be allowed by the proposed IR semantics.

Why do we think that this is an easy-to-make mistake?

I don't think that's the right question. Rather, we should ask: why is it acceptable to define semantics in a way that makes the mistake possible?

My thinking on this is: it's not acceptable, because a narrower change (say, introducing a sanitizer_noreturn attribute) would address the issue without as much potential for abuse.

Can we agree on the following?
"It is orthogonal on the language level, but seems to be redundant in terms of the optimizer. Since LLVM IR's main purpose it support the optimizer, this is a strong argument against the general purpose attribute."

I'm making a more neutral point: that expect_noreturn conflates different concerns -- optimization and sanitizer correctness. I'm not making a claim about what the main purpose of IR is.

That would regress ASan coverage.

You talk specifically about cases of misuses of the attribute, right?
In the context of the current issue with UBSan the possibility for false negative is not too much of a regression: it only occurs when UBSan is going to diagnose an "unreachable error" anyways.

So the main point is whether or not to use a "general purpose" attribute or a "narrow purpose" attribute/intrinsic. My understanding is that you list the following points as arguments against general purpose. Is my understanding accurate?

  1. Potential misuse can regress ASan coverage
  2. Complicates optimizer

Narrow purpose: No potential misuses, and optimizer can simply ignore it.

Yes, I think this is a fair summary, thanks :).

Initially I proposed a narrow purpose attribute, but while iterating on this revision changed it to be general purpose. @eugenis
Now, I have a slight preference for general purpose: I don't think 1. is a big issue (but then again, I have no experience here),

Changes to the IR semantics have hard-to-predict ripple effects on many, many other projects. It pays to be conservative in this area.

and 2. it is always correct for the optimizer to continue ignoring the attribute (current status).
Actually, 2. also encompasses the potential upside: a more complicated optimizer that takes advantage of the attribute to do additional optimizations.

I'm having a hard time thinking of any optimizations based on expect_noreturn which aren't already enabled by the cold attribute. What do you have in mind?

Because "expect_noreturn" calls are allowed to return, the compiler must behave as they could. In particular, this means that unpoisoning the stack before expect_noreturn calls (given the current semantics) is premature.

I don't think that's true. A hypothetical function

maybe_longjmp(jmp_buf env)

that checks an opaque condition needs stack unpoisoning before the call, in the absense of a better solution.

One possible optimization that I can think of is splitting code after the call into a separate basic block and marking it as cold.
Admittedly, that's unlikely to have big impact in practice. I'd guess that [[expect_noreturn]] calls are typically not very hot in the first place.

vsk added a comment.Jan 24 2019, 2:51 PM

Because "expect_noreturn" calls are allowed to return, the compiler must behave as they could. In particular, this means that unpoisoning the stack before expect_noreturn calls (given the current semantics) is premature.

I don't think that's true. A hypothetical function

maybe_longjmp(jmp_buf env)

that checks an opaque condition needs stack unpoisoning before the call, in the absense of a better solution.

Wouldn’t it be preferable to unpoison the stack inside of maybe_longjmp, once the opaque condition can be checked? Even if not, a narrower sanitizer_noreturn attribute is still perfectly fine, here.

One possible optimization that I can think of is splitting code after the call into a separate basic block and marking it as cold.
Admittedly, that's unlikely to have big impact in practice. I'd guess that [[expect_noreturn]] calls are typically not very hot in the first place.

The cold attribute is already used for this kind of splitting/reordering. I don't yet see how expect_noreturn creates new opportunities for the optimizer.

Wouldn’t it be preferable to unpoison the stack inside of maybe_longjmp, once the opaque condition can be checked?

Sure, but that's not always possible. That's why we have interceptors.

One possible optimization that I can think of is splitting code after the call into a separate basic block and marking it as cold.
Admittedly, that's unlikely to have big impact in practice. I'd guess that [[expect_noreturn]] calls are typically not very hot in the first place.

The cold attribute is already used for this kind of splitting/reordering. I don't yet see how expect_noreturn creates new opportunities for the optimizer.

Strictly speaking, cold attribute on a function means that it is rarely called. It does not say anything about the code after the call being colder than the code before the call (within the same BB), which makes splitting the BB pointless.

Anyway, I agree that the arguments [[expect_noreturn]] are not that strong and perhaps don't make the bar for the addition of a new IR attribute.
Should we go back to the intrinsic idea?

vsk added a comment.Jan 24 2019, 4:18 PM

Wouldn’t it be preferable to unpoison the stack inside of maybe_longjmp, once the opaque condition can be checked?

Sure, but that's not always possible. That's why we have interceptors.

Fair enough!

One possible optimization that I can think of is splitting code after the call into a separate basic block and marking it as cold.
Admittedly, that's unlikely to have big impact in practice. I'd guess that [[expect_noreturn]] calls are typically not very hot in the first place.

The cold attribute is already used for this kind of splitting/reordering. I don't yet see how expect_noreturn creates new opportunities for the optimizer.

Strictly speaking, cold attribute on a function means that it is rarely called. It does not say anything about the code after the call being colder than the code before the call (within the same BB), which makes splitting the BB pointless.

That's true, but it's safe to assume that code which dominates the cold call (or is post-dominated by it) is at least as cold as the call.

Anyway, I agree that the arguments [[expect_noreturn]] are not that strong and perhaps don't make the bar for the addition of a new IR attribute.
Should we go back to the intrinsic idea?

Sgtm, I think that'd be the simplest solution (something like inserting llvm.asan.stack_unpoison() where needed).

yln added a comment.Jan 24 2019, 5:11 PM

Seems as if we reached consensus! :) I will change the revision to use an intrinsic.

Before I start doing that, just one more quick idea:
Would it work if UBsan directly inserts calls to __asan_handle_no_return (of course only when ASan is requested). Similar to how it inserts calls to it's own runtime functions (e.g., __ubsan_handle_builtin_unreachable).
If we strive for the "simplest" solution... but maybe I am missing something in this is too simple?

Maybe the frontend should insert __asan_handle_noreturn whenever ASan is enabled, and then ASan would not care about the attribute? I'd like to avoid having this logic in two places.

yln added a comment.Jan 24 2019, 5:31 PM

Maybe the frontend should insert __asan_handle_noreturn whenever ASan is enabled, and then ASan would not care about the attribute? I'd like to avoid having this logic in two places.

+1 for this. @vsk Can you sign off on this design?

vsk added a comment.Jan 24 2019, 7:09 PM
In D56624#1370607, @yln wrote:

Maybe the frontend should insert __asan_handle_noreturn whenever ASan is enabled, and then ASan would not care about the attribute? I'd like to avoid having this logic in two places.

+1 for this. @vsk Can you sign off on this design?

Sounds good to me.

yln abandoned this revision.Jan 25 2019, 6:30 PM

Created new revision for this change: https://reviews.llvm.org/D57278