Page MenuHomePhabricator

[TSan] Support initialize/finalize hooks in dynamic libraries
ClosedPublic

Authored by yln on Mar 17 2021, 1:24 PM.

Details

Summary

Make TSan runtime initialization and finalization hooks work
even if these hooks are not built in the main executable. When these
hooks are defined in another library that is not directly linked against
the TSan runtime (e.g., Swift runtime) we cannot rely on the "strong-def
overriding weak-def" mechanics and have to look them up via dlsym().

Let's also define hooks that are easier to use from C-only code:

extern "C" void __tsan_on_initialize();
extern "C" int __tsan_on_finalize(int failed);

For now, these will call through to the old hooks. Eventually, we want
to adopt the new hooks downstream and remove the old ones.

This is part of the effort to support Swift Tasks (async/await and
actors) in TSan.

rdar://74256720

Diff Detail

Unit TestsFailed

TimeTest
110 msx64 debian > ThreadSanitizer-x86_64.ThreadSanitizer-x86_64::on_initialize_finalize_hooks.cpp
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fsanitize=thread -Wall -m64 -gline-tables-only -I/mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/tsan/../ -O1 /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/tsan/on_initialize_finalize_hooks.cpp -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/tsan/X86_64Config/Output/on_initialize_finalize_hooks.cpp.tmp.lib -fno-sanitize=thread -shared -DBUILD_LIB=1

Event Timeline

yln requested review of this revision.Mar 17 2021, 1:24 PM
yln created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2021, 1:24 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
yln edited the summary of this revision. (Show Details)Mar 17 2021, 1:27 PM
yln added inline comments.Mar 17 2021, 1:31 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
35 ↗(On Diff #331354)

I think this should be okay?! dlsym() is posix and this is a posix file.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html

vitalybuka added inline comments.Mar 17 2021, 1:47 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

Why swift runtime can't do that?
tsan_acquire and tsan_release are in public interface

delcypher requested changes to this revision.EditedMar 17 2021, 1:54 PM

@yln Could we add a test case for some of this new functionality?

We could add a lit testcase that has something like:

RUN: %clang -fsanitize=thread %s -o %t ; %run %t | FileCheck -check-prefixes=CHECK-NO-HOOK,CHECK %s
RUN: %clang -fsanitize=thread -Dtest_task_set_tsan_hooks %s -o %t.with_hook ; %run %t.with_hook | FileCheck -check-prefixes=CHECK-HOOK,CHECK %s

// CHECK-NO-HOOK-NOT: swift_task_set_tsan_hooks CALLED

#if defined(test_swift_task_set_tsan_hooks)
void swift_task_set_tsan_hooks(void*, void*) {
  // CHECK-HOOK: swift_task_set_tsan_hooks CALLED
  printf("swift_task_set_tsan_hooks CALLED\n");
}
#endif


int main() {
  // CHECK: DONE
  printf("DONE\n");
}

That way we test both swift_task_set_tsan_hooks being available and not being available at run time.

Note I'm not sure if it's safe to call printf from the point where TSan calls swift_task_set_tsan_hooks but if it isn't we can probably work around this by setting a variable and reading it later from main().

This revision now requires changes to proceed.Mar 17 2021, 1:54 PM
yln marked an inline comment as done.Mar 17 2021, 2:20 PM

@yln Could we add a test case for some of this new functionality?

We could add a lit testcase that has something like:

RUN: %clang -fsanitize=thread %s -o %t ; %run %t | FileCheck -check-prefixes=CHECK-NO-HOOK,CHECK %s
RUN: %clang -fsanitize=thread -Dtest_task_set_tsan_hooks %s -o %t.with_hook ; %run %t.with_hook | FileCheck -check-prefixes=CHECK-HOOK,CHECK %s

// CHECK-NO-HOOK-NOT: swift_task_set_tsan_hooks CALLED

#if defined(test_swift_task_set_tsan_hooks)
void swift_task_set_tsan_hooks(void*, void*) {
  // CHECK-HOOK: swift_task_set_tsan_hooks CALLED
  printf("swift_task_set_tsan_hooks CALLED\n");
}
#endif


int main() {
  // CHECK: DONE
  printf("DONE\n");
}

That way we test both swift_task_set_tsan_hooks being available and not being available at run time.

Note I'm not sure if it's safe to call printf from the point where TSan calls swift_task_set_tsan_hooks but if it isn't we can probably work around this by setting a variable and reading it later from main().

I thought about how to test this, but concluded that a test like this does not provide much value. The real tests for this are the integration tests in the Swift repo.
Let me know if you insist on adding a test like this.

compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

It currently does, but we would like to avoid the dlsym() cost in the non-TSan case.

vitalybuka added inline comments.Mar 17 2021, 5:34 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

I guess you can do that by implementing weak callback like this__tsan_default_options
This name looks irrelevant, so similar weak can be added.

yln added inline comments.Mar 17 2021, 6:52 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

Thanks for this idea, I didn't think of it!

Just to make sure we are on the same page, you mean the following, right?

// === compiler-rt/tsan ===

// Don't modify, append-only:
struct tsan_functions {
  tsan_acquire,
  tsan_release,
};

SANITIZER_WEAK_DEFAULT_IMPL
void __tsan_receive_function_ptrs(const struct tsan_functions *fn_ptrs, size_t size) {
  // empty
}

// Call above during initialization


// === Swift runtime ===
void __tsan_receive_function_ptrs(const struct tsan_functions *fn_ptrs, size_t size) {
  // save & use function pointers
}
vitalybuka added inline comments.Mar 17 2021, 7:12 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

Better something more generic:

void __tsan_on_init() {
  p = dlsym(... , "__tsan_acquire")
  swift_task_set_tsan_hooks(p);
}
In D98810#2632928, @yln wrote:

@yln Could we add a test case for some of this new functionality?

We could add a lit testcase that has something like:

RUN: %clang -fsanitize=thread %s -o %t ; %run %t | FileCheck -check-prefixes=CHECK-NO-HOOK,CHECK %s
RUN: %clang -fsanitize=thread -Dtest_task_set_tsan_hooks %s -o %t.with_hook ; %run %t.with_hook | FileCheck -check-prefixes=CHECK-HOOK,CHECK %s

// CHECK-NO-HOOK-NOT: swift_task_set_tsan_hooks CALLED

#if defined(test_swift_task_set_tsan_hooks)
void swift_task_set_tsan_hooks(void*, void*) {
  // CHECK-HOOK: swift_task_set_tsan_hooks CALLED
  printf("swift_task_set_tsan_hooks CALLED\n");
}
#endif


int main() {
  // CHECK: DONE
  printf("DONE\n");
}

That way we test both swift_task_set_tsan_hooks being available and not being available at run time.

Note I'm not sure if it's safe to call printf from the point where TSan calls swift_task_set_tsan_hooks but if it isn't we can probably work around this by setting a variable and reading it later from main().

I thought about how to test this, but concluded that a test like this does not provide much value. The real tests for this are the integration tests in the Swift repo.
Let me know if you insist on adding a test like this.

@yln

I won't insist but let me try to persuade you that having a test in compiler-rt is a good idea. It gives us defence in depth. Sure integration tests in Swift should catch issues but it would be much better to catch issues earlier than that. I would say most people who work on compiler-rt aren't testing Swift. If we have a test in compiler-rt it increases the chance that an issue relating to the compiler-rt side of the contract with Swift will be caught earlier. This might be at an engineer's desk (i.e. running tests before landing a patch) or in LLVM's open source bots.

Given that the cost of writing and running such a test is pretty low I don't really see an argument for not doing it.

We could also make the test case I proposed a lot better by using the tsan_acquire and tsan_release function pointers passed into swift_task_set_tsan_hooks to actually suppress a race injected. So we'd run the test program twice

  1. With swift_task_set_tsan_hooks not defined. In that case we should find the injected race.
  2. With swift_task_set_tsan_hooks defined we'll call tsan_acquire and tsan_release to mask the injected race. We'll verify no race is detected.

Doing this would (very crudely) model how we intend to use the function pointers from Swift and verify that the function pointers we will pass actually do what the Swift runtime expects.

yln planned changes to this revision.Mar 18 2021, 11:51 AM
yln added inline comments.
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

Got it! We could even hijack __tsan_default_options() to avoid the need for any modifications in TSan, but that would mean Swift user code can't use it anymore. Let me think about it.

vitalybuka added inline comments.Mar 18 2021, 1:08 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

alternative
somewhere is swift runtime define weak symbol for some public tsan function e.g. tsan_acquire and tsan_release
it they !=null the we run with tsan so set hooks

yln added inline comments.Mar 18 2021, 4:01 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

We also want to avoid that because weak references disable the "two-level namespace" symbol lookup in the linker/loader, again adding a cost to the non-TSan (common) case.

However, what you suggested earlier---having a weak definition in the TSan runtime and a strong (overriding) one in the Swift runtime sounds promising---we only have the "weak def overridden?" cost in the TSan case. I am just having a little trouble getting it to work. *fingers crossed* that it works the way we want for dylibs.

yln added inline comments.Mar 18 2021, 7:01 PM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

Unfortunately, trying to provide a strong definition across *.dylibs appears to be a dead end. Our linker guys said:

In general, strong overriding weak is only guaranteed within a linkage unit (e.g. when linking .o files).
There is just one case where strong in one linkage unit overrides a weak in another. That is when the strong dylib/exe was built, it linked against a dylib with a weak-def.

This is done in the name of performance so the loader can do less work.

when the strong dylib/exe was built, it linked against a dylib with a weak-def.

This is also what __tsan_default_options() relies on. It's usually defined in the main binary, which is linked against the TSan runtime. A potential workaround would be to link the TSan runtime when building the Swift runtime, but that is also far from optimal.

Are you okay with looking up a symbol, e.g., __tsan_on_init() via dlsym() and then calling it? (Similar to what this patch does now, but more generic.)

yln added inline comments.Mar 23 2021, 10:10 AM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

friendly ping 👋 @vitalybuka

vitalybuka added inline comments.Mar 23 2021, 11:52 AM
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
2675 ↗(On Diff #331354)

LGTM, but InitializeInterceptors probably is not a good place, I guess it should be done somewhere near tsan_rtl.cpp:435
I just noticed tsan::OnInitialize
maybe we can just change how we invoke it
I am fine with either new
tsan_on_initialize or OnInitialize.

I suspect OnInitialize is used only by Google internally
So maybe we even create __tsan_on_initialize then later we can switch from and remove OnInitialize.

yln updated this revision to Diff 332794.Mar 23 2021, 2:47 PM

Adopt already-existing tsan::OnInitialize().

yln added inline comments.Mar 23 2021, 2:53 PM
compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
58–77

Should I instead introduce a new extern "C" __tsan_on_initialize()? This would feel a bit cleaner to me than to lookup the mangled C++ name. Let me know what you prefer.

yln retitled this revision from [TSan] Initialize Swift Task runtime hooks to [TSan] Make initialize/finalize hooks work on Darwin.Mar 23 2021, 2:54 PM
yln edited the summary of this revision. (Show Details)
yln updated this revision to Diff 332812.Mar 23 2021, 3:45 PM
  • Introduce new hooks and call old ones
yln added a comment.Mar 23 2021, 3:51 PM

Also added a test. It only explicitly tests the "hooks present" case since all other tests already cover the "hooks absent" case.
@delcypher

compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
58–77

I went ahead and made this change. I have a slight preference for the current version, but I am happy to go either way. See previous diff for the C++ version.

Let me know which version you prefer @vitalybuka

@yln the linked commit from the description uses swift_task_set_tsan_hooks. I don't see any use of that here. Are going for a different approach now?

compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
69

Are we relying on the fact that the Darwin`__tsan_on_initialize` definition here is mangled? If we were to make these extern "C" would dlsym() end up returning a pointer to __tsan_on_initialize() in the TSan runtime instead of the implementation we want? If this is the case we should probably be explicit that we are relying on this.

86

Do these definitions need extern "C"?

yln edited the summary of this revision. (Show Details)Mar 23 2021, 4:59 PM
yln marked an inline comment as done.Mar 23 2021, 5:01 PM

@yln the linked commit from the description uses swift_task_set_tsan_hooks. I don't see any use of that here. Are going for a different approach now?

Updated summary. Swift side is here: https://github.com/apple/swift/pull/36478

compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
69

The important part is that they are static/local to the translation unit (and cannot be looked up). Using the same name here avoids the need for a "call_on_initialize_hook()` function.

86
#define SANITIZER_WEAK_DEFAULT_IMPL \
  extern "C" SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE NOINLINE
vitalybuka accepted this revision.Mar 23 2021, 6:44 PM
vitalybuka retitled this revision from [TSan] Make initialize/finalize hooks work on Darwin to [TSan] Support initialize/finalize hooks in dynamic libraries.
vitalybuka edited the summary of this revision. (Show Details)

This way works for Linux as well

bool -> int

vitalybuka edited the summary of this revision. (Show Details)Mar 23 2021, 6:51 PM
yln updated this revision to Diff 332849.Mar 23 2021, 7:15 PM
yln marked an inline comment as done.

Fix clang-tidy:

  • 'const' type qualifier on return type has no effect [clang-diagnostic-ignored-qualifiers]
  • 'auto ptr' can be declared as 'auto *ptr' [llvm-qualified-auto]
yln added a comment.Mar 23 2021, 7:19 PM

Ahh, I accidentally overwrote your very last diff. Will fix. Also thanks so much, Vitaly!

yln updated this revision to Diff 332853.Mar 23 2021, 7:25 PM

Restoring interface changes. Removing const from interface delcartions (clang-tidy).

yln added a comment.Mar 23 2021, 7:26 PM

@delcypher Good to go now?

Harbormaster completed remote builds in B95389: Diff 332846.
delcypher accepted this revision.Mar 24 2021, 11:22 AM

@yln I've suggested some documentation we might want to add to the new interface. I'll leave it up to you if you want to add it. Other than that LGTM.

compiler-rt/include/sanitizer/tsan_interface.h
161 ↗(On Diff #332853)

@yln: Suggestion: Should we document the failed parameter and the return value? E.g.

// `failed` - `0` if the TSan runtime did not detected issues. Non zero otherwise.
// Return value - Returns `0` if the TSan runtime should exit as if no issues occurred. Non zero otherwise.
compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
69

Okay. Sounds like its intentional. Could you add a small comment explaining? It's a subtle point that's very easy to miss when looking at the code.

This revision is now accepted and ready to land.Mar 24 2021, 11:22 AM
yln marked 2 inline comments as done.Mar 24 2021, 12:27 PM
yln added inline comments.
compiler-rt/include/sanitizer/tsan_interface.h
161 ↗(On Diff #332853)

Thanks! Will add when committing.

compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
69

Vitaly updated the code to also dlsym() on Linux (instead of weak default implementation), so this aspect of the implementation went away.

This revision was automatically updated to reflect the committed changes.
yln marked 2 inline comments as done.