This is an archive of the discontinued LLVM Phabricator instance.

[asan] Unpoison signal alternate stack.
ClosedPublic

Authored by robot on Mar 28 2020, 6:09 AM.

Details

Summary

Before unwinding the stack, __asan_handle_no_return is supposed to
unpoison the entire stack - that is, remove the entries in the shadow
memory corresponding to stack (e.g. redzone markers around variables).
This does not work correctly if __asan_handle_no_return is called from
the alternate stack used in signal handlers, because the stack top is
read from a cache, which yields the default stack top instead of the
signal alternate stack top.

It is also possible to jump between the default stack and the signal
alternate stack. Therefore, __asan_handle_no_return needs to unpoison
both.

Diff Detail

Event Timeline

robot created this revision.Mar 28 2020, 6:09 AM
Herald added a subscriber: Restricted Project. · View Herald TranscriptMar 28 2020, 6:09 AM
robot edited the summary of this revision. (Show Details)Mar 28 2020, 6:13 AM
robot edited the summary of this revision. (Show Details)
robot updated this revision to Diff 254488.Apr 2 2020, 4:43 AM

Rebased, slightly better error reporting in test.

robot updated this revision to Diff 255325.Apr 6 2020, 7:22 AM

Fix clang-tidy issues. This cause inconsistency in code style in asan (snake case vs camel case), because I did not want to touch all the asan code in this patch.

eugenis added inline comments.Apr 9 2020, 1:06 PM
compiler-rt/lib/asan/asan_rtl.cpp
620 ↗(On Diff #255325)

I do not think this is true - false positive reports are only possible if we fail the kMaxExpectedCleanupSize check below.
Remove this statement?

compiler-rt/lib/sanitizer_common/sanitizer_common.h
331 ↗(On Diff #255325)

I want to say that this will break compilation on fuchsia and rtems because you only have definitions for posix + win, but then we have the same situation with GetThreadStackAndTls on fuchsia, and things are somehow fine. Any idea why? Consider adding an inline implementation in the header under the complementary condition (i.e. not-posix).

robot updated this revision to Diff 256849.Apr 12 2020, 5:41 AM
robot marked 2 inline comments as done.

Refactored the way stack bounds are acquired in __asan_handle_no_return. This should fix compilation on Fuchsia and RTEMS.

compiler-rt/lib/asan/asan_rtl.cpp
620 ↗(On Diff #255325)

You sent me down a rabbit hole there :)
My initial reasoning was indeed flawed, but I still think a false positive is possible in the context. Please see the comment I've added to the source.

compiler-rt/lib/sanitizer_common/sanitizer_common.h
331 ↗(On Diff #255325)

From a bit of experimentation, I think the dead branches are dropped even at -O0. Therefore there's never a symbol reference for the linker to resolve.

Regarding Fuchsia and RTEMS, I've pondered a bit on where to put things, and came up with another approach. It's more intrusive but avoids noop-functions. Please let me know what you think.

robot updated this revision to Diff 257993.Apr 16 2020, 2:11 AM

Improve portability by supplying fd=-1 to mmap.

robot updated this revision to Diff 258685.Apr 20 2020, 2:22 AM

Fix Windows and MacOS builds:

  • #ifdef SANITIZER_POSIX -> #if SANITIZER_POSIX (it's always defined)
  • don't address sigaction::sa_restorer member, just let it be implicitly zero-initialized
  • use sigemptyset to clear the signal mask
robot added a comment.May 22 2020, 6:07 AM

Would be great if you could have another look.

Sorry for the delay.

I find this very hard to read. You add 150 lines to asan_rtl.cpp that do basically nothing. There is maybe 10 lines of the actual logic. I wonder if the same could be expressed in less code, and especially in fewer functions?

kcc added a comment.Jun 4 2020, 1:53 PM

also, please avoid #ifdefs.
OS-specific code should go to an OS-specific file.

vitalybuka added inline comments.Jun 4 2020, 5:55 PM
compiler-rt/lib/asan/asan_rtl.cpp
696 ↗(On Diff #258685)

Please, extract into a single NFC patch the code
you move existing code for stack calculation into a new function with
common for all platforms. e.g. like the getCurrentStackBounds
and move the current relevant code there. Please put implementation into relevant platform specific files. This will help us avoid ifdef

Then you can rebase D76986 so it will be about behavior changes only without any refactoring.

743 ↗(On Diff #258685)

historically code in sanitizer_common and other sanitizers follow Google C++ style, not LLVM
please keep it consistent.

compiler-rt/lib/sanitizer_common/sanitizer_posix.h
123 ↗(On Diff #258685)

Code should try to match style around. These comments looks very different from what is on the rest of sanitizer_common

compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp
212 ↗(On Diff #258685)

this easily can be compiled into libc memset interceptors, that's why we use
internal_memset(&sigact, 0, sizeof(sigact)); around

compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
56

I guess all above is irrelevant to the test?
In most case here just ignoring the return code is appropriate solution.

89

it could be more readable if you inline both these functions

robot updated this revision to Diff 268874.Jun 5 2020, 10:37 AM
robot marked 6 inline comments as done.

Split refactoring into separate commit.

robot added a comment.Jun 5 2020, 10:40 AM

I'm trying to split this into two changes, please bear with my while I'm trying to figure out Phabricator & Arcanist.

compiler-rt/lib/asan/asan_rtl.cpp
696 ↗(On Diff #258685)

That's a great idea, thanks! I've reduced the refactoring to a small amount, unfortunately I didn't find a better solution than to introduce customization points for RTEMS, Fuchsia, POSIX and Windows. Logically speaking, those all behave differently, but only slightly. I've struggled with making this difference explicit without duplicating code.

743 ↗(On Diff #258685)

I see. I was trying to satisfy Harbormaster's clang-format checks, which complained about my original code and suggested LLVM style. I've tried to adapt.

compiler-rt/lib/sanitizer_common/sanitizer_posix.h
123 ↗(On Diff #258685)

I've seen some doxygen in compiler-rt/lib (e.g. sanitizer_common), so I was trying to use it for the new code.
Reverted to plain C comments, hope that's what you meant?

compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp
212 ↗(On Diff #258685)

That's good to know, thanks! I'm not sure though which patterns I should replace with a memset. The code now looks very different anyway, please have a look.

compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
56

In other projects, I encountered intermittent faults in CI / push barriers / continuous testing when using system calls - most likely caused by massive parallelism on the test machine. A test would fail or time out, which is hard to reproduce locally. Any error from functions like mmap or pthread_create are not test failures due to a bug in asan, but they are infrastructure issues.

So yes, I can certainly remove the error checks without affecting the test itself, but I fear that this decreases the confidence that test failures indicate a bug in asan.

89

I'm not sure if I can inline prepareStackShadow, from a logical point of view:
I want to test that the redzones created by the local variable Blob in prepareStackShadow are gone.
But if I place the variable into testHandeNoReturn, then its redzones should logically still be there in the catch clause, because the variable is still accessible.
In practice, the redzones should have been deleted because the throw calls __asan_handle_no_return which removes all redzones on the stack. However, I'm not sure if it's good if the test would rely on this behaviour.

I can inline assertNoRedzones, no objections.

robot updated this revision to Diff 268878.Jun 5 2020, 10:44 AM

Trying to split change into two commits.

robot added a comment.Jun 5 2020, 10:48 AM

@eugenis I've tried to minimize impact. Hope it's better now.
@kcc I've left most of the logic now as it was originally, not in an OS-specific file but using plain ifs (not ifdefs). Extracted a customization point for POSIX, however I could only come up with a solution that defines GetStackBounds in four different files.

vitalybuka added inline comments.Jun 5 2020, 3:44 PM
compiler-rt/lib/asan/asan_posix.cpp
45

most of this comment is more relevant to GetStackBounds than to this function
also it's explains what I already see from the code.
I'd refer you remove it. Maybe move some into patch description.

79

This function looks inconsistent. Sometimes it returns current stack, sometimes the given stack from curr_thread.
Please remove argument from here and move GetCurrentThread(); into GetStackBoundsFromCacheOrOs

compiler-rt/lib/asan/asan_rtems.cpp
67 ↗(On Diff #268878)

GetStackBounds -> GetCurrentStackBounds

compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
105

most comments here and in the lib are quite redundant

112

the test does not cover the case when we throw on default stack but sigaltstack is set.

@eugenis WDTY?
I'd like to ask you to always unpoison both: default and altstack from __asan_handle_no_return.
Currently asan_handle_no_return is just a bailout for exceptions. I don't remember that we tried to avoid false negatives after them. Only that it's not not going to repot false positives. So it should be fine.
Also in case of fake_stack it may unpoison both already.

155

Not sure that all compilers will accepts designated initializers in C++ mode. Runtime is compiled and tested with gcc as well

eugenis added inline comments.Jun 5 2020, 3:54 PM
compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
112

I don't see why we would do that. It's true that we are more concerned with correctness than completeness, but we should not unpoison stuff that we definitely do not need to.

FakeStack is different - it is logically interleaved with the main stack.

eugenis added inline comments.Jun 5 2020, 3:58 PM
compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
112

Hmm, wait, it's possible to longjmp from altstack to the main stack, is it not?
Does it mean we need to unpoison both, at least when the current stack == altstack?

robot updated this revision to Diff 269290.Jun 8 2020, 10:53 AM
robot marked 9 inline comments as done.

Unpoison both stacks always.

To support jumps between stacks, always unpoison both the signal
alternate stack and the default stack.

  • [asan] Refactor fetching of stack boundaries
  • [asan] Unpoison signal alternate stack.
robot added a comment.Jun 8 2020, 10:54 AM

Now unpoisoning both stacks, always.

compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
105

lib meaning the asan library (compiler-rt/lib)?
I've removed the last big comment there on your request.
Here in the test I'm trying to explain why the test setup does its steps.

112

Ah, yes, I've even seen this in production code. It's probably possible to jump from main stack to altstack as well, so I've tried to use the simplest solution and unpoison both stacks always.

155

Ah, I was trying to use the C++20 feature but didn't realize the flag is not set in the test command line. I've chosen to set the flag explicitly, but I can also replace the designated initializers of course.

robot edited the summary of this revision. (Show Details)Jun 8 2020, 10:56 AM
robot updated this revision to Diff 269521.Jun 9 2020, 6:20 AM

Formatting.

  • [asan] Refactor fetching of stack boundaries.
  • [asan] Unpoison signal alternate stack.

Formatting.

  • [asan] Refactor fetching of stack boundaries.

can you please move this into a separate review? It's just another NFC patch, no new tests will be needed.

compiler-rt/lib/asan/asan_internal.h
86 ↗(On Diff #269521)

I'd avoid the list of platforms here as it's may because outdated with time.

compiler-rt/lib/asan/asan_rtl.cpp
568 ↗(On Diff #269521)

please don't mix refactoring with functional changes in the same patch.
if you want to rename something create another patch.
However, here new naming conversion does not match code in this file, so it should not be renamed like this at all.

578 ↗(On Diff #269521)
if (bounds.top == bounds.bottom)
  return;

and you can remove have_bounds

613 ↗(On Diff #269521)
// everywhere
void PlatformUnpoisonStacks() {
  return false; 
}

// linux
void PlatformUnpoisonStacks() {
  return true or false; 
}

void NOINLINE __asan_handle_no_return() {
  if (asan_init_is_running)
    return;

  if (!PlatformUnpoisonStacks())
     UnpoisonStacks();
  UnpoisonCurrentFakeStack();
}

This way only PlatformUnpoisonStacks needs to be in headers and you can drop StackBounds at all
I would not bother to try to get "WARNING: ASan is ignoring requested" from PlatformUnpoisonStacks

Still request about separate refactoring patch is relevant.

if you like to avoid the new patch then we can do the following:

void NOINLINE __asan_handle_no_return() {
  if (asan_init_is_running)
      return;

  if (!PlatformUnpoisonStacks()) {
   // entire block of existing code
   int local_stack;
   ...
   PoisonShadow(bottom, top - bottom, 0);
  }

  if (curr_thread && curr_thread->has_fake_stack())
    curr_thread->fake_stack()->HandleNoReturn();
}
743 ↗(On Diff #258685)

please use clang-format with --style=file

compiler-rt/lib/asan/asan_win.cpp
273 ↗(On Diff #269521)

I would prefer we do not let platform to choose not to call UnpoisonCurrentFakeStack();

compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
56

it's applies to many other tests, but we don't overload them error handling logic

robot updated this revision to Diff 269862.Jun 10 2020, 8:40 AM
robot marked 6 inline comments as done.

Extracted refactoring into D81577

compiler-rt/lib/asan/asan_rtl.cpp
568 ↗(On Diff #269521)

Sorry, the renaming was not intentional. I'm running part of the CI linter script locally, and this performed a git-clang-format - essentially what Harbormaster is doing. Will undo.

I misunderstood the request for an NFC change and split the refactoring into its own commit in the same phabricator diff. Will create a new diff out of the refactoring.

613 ↗(On Diff #269521)

There are two ways to unpoison the default stack in POSIX case:

  • If we're on the sigalt stack, try cache first, fall back to asking OS.
  • If we're not on the sigalt stack, never use cache.

I've chosen to put the first into UnpoisonStack and inline the second one into PlatformUnpoisonStacks. However, this also suppresses any warning for the default stack bounds in case we're on the sigalt stack.
It is also questionable if we should ever use the cache on Linux: I've left the problem with SS_AUTODISARM unsolved, so in case we're on the sigalt stack but with SS_AUTODISARM, we'll attempt to use the cache and might produce a warning. If we wouldn't use the cache, we'd properly unpoison the default stack.

compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
56

Not sure which error handling is relevant, I've removed all error handling now.

robot marked an inline comment as done.Jun 10 2020, 9:11 AM
robot added inline comments.
compiler-rt/lib/asan/asan_posix.cpp
51

In order to properly solve the case "on sigalt stack but SS_AUTODISARM was used", we could intercept sigaltstack and register the signal alternate stack with the thread object in asan. Maybe in another revision.

vitalybuka accepted this revision.Jun 10 2020, 6:50 PM

LGTM with few changes

compiler-rt/lib/asan/asan_posix.cpp
43–67

don't need on_sig_alt_stack

just

if (signal_stack.ss_flags != SS_ONSTACK)
  return false;
66

Please in the first patch extract warning block and call it here (and above)

This revision is now accepted and ready to land.Jun 10 2020, 6:50 PM
robot updated this revision to Diff 270238.Jun 11 2020, 2:26 PM
robot marked 2 inline comments as done.

Simplify logic, warn about unlikely stack bounds in PlatformUnpoisonStacks.

robot marked 4 inline comments as done.Jun 11 2020, 2:28 PM
vitalybuka accepted this revision.Jun 12 2020, 7:39 PM

Do you need me to land these patches?

compiler-rt/test/asan/TestCases/unpoison-alternate-stack.cpp
1

This test should be under compiler-rt/test/asan/TestCases/Posix or even compiler-rt/test/asan/TestCases/Linux

robot marked an inline comment as done.Jun 13 2020, 2:41 AM

Do you need me to land these patches?

Yes, please. I don't have commit rights.

robot updated this revision to Diff 270568.Jun 13 2020, 2:41 AM

Move test to posix subfolder, since it requires posix features.

This revision was automatically updated to reflect the committed changes.