This is an archive of the discontinued LLVM Phabricator instance.

[asan darwin] Allow clients to implement `__sanitizer_report_error_summary`
AcceptedPublic

Authored by dmaclach on Feb 26 2023, 10:16 AM.

Details

Summary

__sanitizer_report_error_summary is declared llvm/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_interface_internal.h as being able to be overridden by the client. On darwin the sanitizer runtime uses this symbol to find references to the sanitizer libraries, so if you override it you end up with the error =ERROR: Interceptors are not working. This may be because AddressSanitizer is loaded too late (e.g. via dlopen). Please launch the executable with: at launch time.

Replace uses of __sanitizer_report_error_summary for finding the sanitizer libraries with using the address of a local function.

Diff Detail

Event Timeline

dmaclach created this revision.Feb 26 2023, 10:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2023, 10:16 AM
Herald added a subscriber: Enna1. · View Herald Transcript
dmaclach requested review of this revision.Feb 26 2023, 10:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2023, 10:16 AM
dmaclach edited the summary of this revision. (Show Details)Feb 26 2023, 10:59 AM
yln added a comment.EditedFeb 27 2023, 3:00 PM

Thank you @dmaclach!

The explanation of the problem und fix make sense to me. Can we add a test for this in sanitizer_common?

Note, I refactored this code in D129157, but the choice of __sanitizer_report_error_summary predates that: D129157#4156731

LGTM, under the condition that we add a test for this in sanitizer_common.

b/llvm/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp
991 ↗(On Diff #500599)

"__sanitizer_print_stack_trace" is an exported function, but not overridable by the client.

yln accepted this revision.Feb 27 2023, 3:03 PM
This revision is now accepted and ready to land.Feb 27 2023, 3:03 PM
dmaclach updated this revision to Diff 500945.Feb 27 2023, 3:37 PM

This diff should be buildable. Adding requested test shortly.

Herald added a subscriber: Restricted Project. · View Herald TranscriptFeb 27 2023, 3:37 PM
vitalybuka added a subscriber: vitalybuka.

Adding myself to review the test and landing the patch.

This revision now requires review to proceed.Feb 27 2023, 3:40 PM
dmaclach updated this revision to Diff 500967.Feb 27 2023, 4:34 PM
dmaclach edited the summary of this revision. (Show Details)

Switched over to using "sanitizer_sandbox_on_notify" from "sanitizer_print_stack_trace" because it's available in the internal headers and serves the same purpose.

Switched over to using "sanitizer_sandbox_on_notify" from "sanitizer_print_stack_trace" because it's available in the internal headers and serves the same purpose.

This is also weak symbol which is expected to be redefined by user, like original __sanitizer_report_error_summary

Switched over to using "sanitizer_sandbox_on_notify" from "sanitizer_print_stack_trace" because it's available in the internal headers and serves the same purpose.

This is also weak symbol which is expected to be redefined by user, like original __sanitizer_report_error_summary

actually it's defined by sanitizer, not sure why this is weak

What is wrong with __sanitizer_print_stack_trace? Seems like more stable bet.

yln added a comment.Feb 28 2023, 11:56 AM

[... ] __sanitizer_print_stack_trace? Seems like more stable bet.

+1

[... ] __sanitizer_print_stack_trace? Seems like more stable bet.

+1

__sanitizer_print_stack_trace is declared in compiler-rt/include/sanitizer/common_interface_defs.h which isn't on the include path for compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp. If I include it I get an error error: typedef redefinition with different types ('struct __sanitizer_sandbox_arguments' vs 'struct __sanitizer_sandbox_arguments') which is accurate because struct __sanitizer_sandbox_arguments is declared in both common_interface_defs.h and sanitizer_internal_interface.h.

If I add a declaration for __sanitizer_print_stack_trace to sanitizer_internal_interface.h I end up with

Undefined symbols for architecture arm64:
  "___sanitizer_print_stack_trace", referenced from:
      __sanitizer::InitializePlatformEarly() in sanitizer_mac.cpp.o

when attempting to link lib/clang/17/lib/darwin/libclang_rt.stats_osx_dynamic.dylib. So it looks like I want to test against a symbol that is defined inside of compiler-rt/lib/sanitizer_common and __sanitizer_sandbox_on_notify seemed like a reasonable candidate.

Thoughts?

yln added a comment.Mar 2 2023, 11:19 AM

Taking a step back, please check my understanding:

In VerifyInterceptorsWorking() we want to check that interceptors are setup correctly by making sure that the dlsym("puts")+dladdr() lookup for a function that we know is intercepted actually returns it's interceptor (wrap_puts()).
For the second part, we don't actually check the exact interceptor function but rather that the interceptor is defined in our own sanitizer runtime image. We lookup our own image by using dladdr() on a function that we define, but can't be overridden by the user (current bug that we are fixing here didn't consider this and we chose a function that actually can be overridden by the user program).

So if we keep using this approach, we want to find a function that is:

  • Defined in the sanitizer runtime
  • Not overridable
  • Can be looked up via dladdr() (Can you find out and explain here what exactly the rules around this are? Does the function need to be exported?)

It sounds like an easy, stable solution would be to define a dummy function like __sanitizer_common_module_lookup_dummy() that fulfills all the requirements and just use that instead of picking a "random", already existing function.

Also, I am not sure why we do the detour to go to the image and then compare image names. Maybe it's worth exploring if we can do something more direct, e.g., something in the spirit of dlsym("puts") == &wrap_puts?

yln requested changes to this revision.Mar 2 2023, 11:20 AM
This revision now requires changes to proceed.Mar 2 2023, 11:20 AM
dmaclach updated this revision to Diff 501966.Mar 2 2023, 1:43 PM
dmaclach edited the summary of this revision. (Show Details)

Taking a step back, please check my understanding:

So if we keep using this approach, we want to find a function that is:

  • Defined in the sanitizer runtime
  • Not overridable
  • Can be looked up via dladdr() (Can you find out and explain here what exactly the rules around this are? Does the function need to be exported?)

Thank you for making me think about it. dladdr() looks up an address...not a function. So we can use any function that we can get the address of (such as the function we are in).

Also, I am not sure why we do the detour to go to the image and then compare image names. Maybe it's worth exploring if we can do something more direct, e.g., something in the spirit of dlsym("puts") == &wrap_puts?

wrap_puts isn't defined. It's being called wrap_puts in the comment because it's our "puts" that is being linked in in place of the system puts. Hopefully the solution that we have now is acceptable.

ninja check-asan on darwin gives me:

Testing Time: 769.12s
  Unsupported      :  455
  Passed           : 1033
  Expectedly Failed:    2
yln accepted this revision.Mar 2 2023, 2:59 PM

Okay, so dladdr() just takes an address and should always work irrespective if the function that lives at that address was exported etc., yes?

LGTM, please add a test for the original bug: user program overriding __sanitizer_report_error_summary.

dmaclach updated this revision to Diff 502684.Mar 6 2023, 9:16 AM

Added test in asan. There wasn't an easy spot to put it in the sanitizer_common tests because I wanted to verify that the actual sanitizer libraries were calling through correctly when __sanitizer_report_error_summary was overridden.

vitalybuka updated this revision to Diff 502857.Mar 6 2023, 4:37 PM

simplify test

vitalybuka accepted this revision.Mar 6 2023, 4:37 PM
This revision is now accepted and ready to land.Mar 6 2023, 4:37 PM
This revision was landed with ongoing or failed builds.Mar 6 2023, 4:38 PM
This revision was automatically updated to reflect the committed changes.
usama54321 reopened this revision.EditedMar 10 2023, 7:17 AM
usama54321 added a subscriber: usama54321.

This test is failing on ios arm64e. It seems like the function is not getting overridden on ios.

This revision is now accepted and ready to land.Mar 10 2023, 7:17 AM
rsundahl added inline comments.Mar 10 2023, 8:27 AM
compiler-rt/test/asan/TestCases/report_error_summary.cpp
4
// Required for dyld macOS 12.0+
#if (__APPLE__)
__attribute__((weak))
#endif

You'll need to add this if you want dyld64 to see this override of __sanitizer_report_error_summary(). dyld64 will not consider and open a dylib during weak def coalescing unless there is at least one weak symbol in it. It doesn't have to be the intended "stronger" symbol, it could be any symbol like "foo", the module just has to have at least one.

rsundahl added inline comments.Mar 10 2023, 8:43 AM
compiler-rt/test/asan/TestCases/report_error_summary.cpp
4

dyld64 -> ld64

vitalybuka added inline comments.Mar 10 2023, 9:47 AM
compiler-rt/test/asan/TestCases/report_error_summary.cpp
4

Something like D145810?
I don't have MacOS dev setup to test

usama/rsundahl how do I run an iOS test? I can't do arm, but x86_64 should be equivalent for a dyld problem? Also is there a waterfall for these types of breakages that I don't know about?

rsundahl added inline comments.Mar 10 2023, 4:54 PM
compiler-rt/test/asan/TestCases/report_error_summary.cpp
4

Yes, that would work but we've landed more on the pattern int https://reviews.llvm.org/D130917 which doesn't add any symbols (we chose not to add any symbols so that if this "trick" was needed by two dylibs, there wouldn't be a name clash but you won't need to worry about that in a test.

usama/rsundahl how do I run an iOS test? I can't do arm, but x86_64 should be equivalent for a dyld problem? Also is there a waterfall for these types of breakages that I don't know about?

Not a waterfall, just a test failure due to an issue best read here: https://reviews.llvm.org/D127929 (show older changes). You can't test on iOS but you could test on macOS but even then your minimum deployment target has to be pretty high to see the issue manifest. In other words, you unsurprisingly couldn't anticipate this w/o better documentation and evangelization of the issue.

Not a waterfall, just a test failure due to an issue best read here: https://reviews.llvm.org/D127929 (show older changes). You can't test on iOS but you could test on macOS but even then your minimum deployment target has to be pretty high to see the issue manifest. In other words, you unsurprisingly couldn't anticipate this w/o better documentation and evangelization of the issue.

Thank you @rsundahl . Appreciate the context.