Page MenuHomePhabricator

[sanitizer] Simplify GetTls with dl_iterate_phdr on Linux and use it on musl/FreeBSD
ClosedPublic

Authored by MaskRay on Mar 19 2021, 12:56 AM.

Details

Summary

... so that FreeBSD specific GetTls/glibc specific pthread_self code can be
removed. This also helps FreeBSD arm64/powerpc64 which don't have GetTls
implementation yet.

GetTls is the range of

  • thread control block and optional TLS_PRE_TCB_SIZE
  • static TLS blocks plus static TLS surplus

On glibc, lsan requires the range to include
pthread::{specific_1stblock,specific} so that allocations only referenced by
pthread_setspecific can be scanned.

This patch uses dl_iterate_phdr to collect TLS blocks. Find the one
with dlpi_tls_modid==1 as one of the initially loaded module, then find
consecutive ranges. The boundaries give us addr and size.

This allows us to drop the glibc internal _dl_get_tls_static_info and
InitTlsSize. However, huge glibc x86-64 binaries with numerous shared objects
may observe time complexity penalty, so exclude them for now. Use the simplified
method with non-Android Linux for now, but in theory this can be used with *BSD
and potentially other ELF OSes.

This removal of RISC-V __builtin_thread_pointer makes the code compilable more
compiler versions (added in Clang in 2020-03, added in GCC in 2020-07).

This simplification enables D99566 for TLS Variant I architectures.

Note: as of musl 1.2.2 and FreeBSD 12.2, dlpi_tls_data returned by
dl_iterate_phdr is not desired: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254774
This can be worked around by using __tls_get_addr({modid,0}) instead
of dlpi_tls_data. The workaround can be shared with the workaround for glibc<2.25.

This fixes some tests on Alpine Linux x86-64 (musl)

test/lsan/Linux/cleanup_in_tsd_destructor.c
test/lsan/Linux/fork.cpp
test/lsan/Linux/fork_threaded.cpp
test/lsan/Linux/use_tls_static.cpp
test/lsan/many_tls_keys_thread.cpp

test/msan/tls_reuse.cpp

and test/lsan/TestCases/many_tls_keys_pthread.cpp on glibc aarch64.

The number of sanitizer test failures does not change on FreeBSD/amd64 12.2.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
vitalybuka added inline comments.Mar 23 2021, 11:45 PM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
271

Oh, I forgot this one.
Originally sanitizers were made with Google Style. Still it uses statics more often then unnamed namespace.
Lets keep static.

340

check-sanitizer gets

/usr/bin/ld: SANITIZER_TEST_OBJECTS.sanitizer_linux_test.cpp.x86_64.o: in function `__sanitizer::thread_self_offset_test_func(void*)':
/usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:193: undefined reference to `__sanitizer::ThreadSelf()'
/usr/bin/ld: /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:193: undefined reference to `__sanitizer::ThreadSelfOffset()'
/usr/bin/ld: /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:193: undefined reference to `__sanitizer::ThreadSelf()'
/usr/bin/ld: SANITIZER_TEST_OBJECTS.sanitizer_linux_test.cpp.x86_64.o: in function `__sanitizer::SanitizerLinux_ThreadSelfOffset_Test::TestBody()':
/usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:193: undefined reference to `__sanitizer::ThreadSelf()'
/usr/bin/ld: /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:193: undefined reference to `__sanitizer::ThreadSelfOffset()'
/usr/bin/ld: /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:193: undefined reference to `__sanitizer::ThreadSelf()'
/usr/bin/ld: SANITIZER_TEST_OBJECTS.sanitizer_linux_test.cpp.x86_64.o: in function `__sanitizer::thread_descriptor_size_test_func(void*)':
/usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:208: undefined reference to `__sanitizer::ThreadSelf()'rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:193: undefined reference to `__sanitizer::ThreadSelf()'

also it's still declared in sanitizer_linux.h and used in unittest

MaskRay updated this revision to Diff 333119.Mar 24 2021, 2:19 PM

Delete check-sanitizer found tests.

Hack powerpc64 to use 1664. For unknown reasons it is needed for an internal test.

MaskRay marked an inline comment as done.Mar 24 2021, 2:25 PM

By intercepting pthread_setspecific, we can track thread-specific data keys for lsan in a better way and likely will be able to get rid of ThreadDescriptorSize() and TlsPreTcbSize().

compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
340

Thanks. I did not know check-sanitizer. I used check-asan check-msan check-hwasan check-tsan check-ubsan check-cfi check-lsan ...

MaskRay marked an inline comment as done.Mar 24 2021, 2:26 PM
vitalybuka accepted this revision.Mar 24 2021, 9:11 PM
vitalybuka added inline comments.
compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp
188

Please keep ThreadDescriptorSize like this
ThreadDescriptorSize is version specific so we want to know if it executed on system with unexpected size

This revision is now accepted and ready to land.Mar 24 2021, 9:11 PM
MaskRay updated this revision to Diff 333210.Mar 24 2021, 10:46 PM
MaskRay marked 2 inline comments as done.

add back a unit test

MaskRay updated this revision to Diff 333457.Mar 25 2021, 4:22 PM

Explain why we need the 1664 static TLS surplus.

Performed a large-scale testing internally and it looks good.

MaskRay edited the summary of this revision. (Show Details)Mar 25 2021, 4:54 PM
This revision was landed with ongoing or failed builds.Mar 25 2021, 9:55 PM
This revision was automatically updated to reflect the committed changes.
phosek added a subscriber: phosek.Apr 2 2021, 5:38 PM

After this change, we're also seeing a leak being reported in BoringSSL when running on Debian 9 (stretch) which appears to be a false positive, see fxbug.dev/73487 for more details.

MaskRay added a comment.EditedApr 2 2021, 5:55 PM

After this change, we're also seeing a leak being reported in BoringSSL when running on Debian 9 (stretch) which appears to be a false positive, see fxbug.dev/73487 for more details.

Is the false positive fixed by 58c62fd9768594ec8dd57e8320ba2396bf8b87e5 [sanitizer] Improve accuracy of GetTls on x86/s390? And note that the change only affects Linux, so Fuchsia is not affected.

I have found an approach to work around the Ubuntu 16.04 glibc 2.23 bug https://sourceware.org/pipermail/libc-alpha/2021-April/124697.html without making the code dirty.
I am testing it.

MaskRay reopened this revision.Apr 3 2021, 4:51 PM
This revision is now accepted and ready to land.Apr 3 2021, 4:51 PM
MaskRay updated this revision to Diff 335117.Apr 3 2021, 4:56 PM
MaskRay retitled this revision from [sanitizer] Simplify GetTls with dl_iterate_phdr to [sanitizer] Simplify GetTls with dl_iterate_phdr on Linux.
MaskRay edited the summary of this revision. (Show Details)

Fix Ubuntu 16.04 glibc 2.23

This revision was landed with ongoing or failed builds.Apr 4 2021, 3:36 PM
This revision was automatically updated to reflect the committed changes.
jyknight added inline comments.
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
424

This seems not OK to call this here. I'm seeing an ASAN "stack-use-after-scope" arising from this change, which I believe is because of calling this function during thread setup, before the stack bounds have been set in asan.

#0 0x5620b775273b in __interceptor_confstr compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:3713:5
#1 0x5620b77128ea in __sanitizer::GetLibcVersion(int*, int*, int*) compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp:174:14
#2 0x5620b77129c5 in __sanitizer::GetTls(unsigned long*, unsigned long*) compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp:424:7
#3 0x5620b7712e89 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp:525:3
#4 0x5620b77a8f5d in __asan::AsanThread::SetThreadStackAndTls(__asan::AsanThread::InitOptions const*) compiler-rt/lib/asan/asan_thread.cpp:308:3
#5 0x5620b77a8b32 in __asan::AsanThread::Init(__asan::AsanThread::InitOptions const*) compiler-rt/lib/asan/asan_thread.cpp:235:3
#6 0x5620b77a9076 in __asan::AsanThread::ThreadStart(unsigned long long) compiler-rt/lib/asan/asan_thread.cpp:266:3

Probably should revert the change for now.

DavidSpickett added a subscriber: DavidSpickett.EditedApr 6 2021, 3:34 AM

I've disabled many_tls_keys_pthread.cpp for AArch64 as it requires this change (https://reviews.llvm.org/rG34f8a7f93c98b7c189f9684d0d207ab42aca166f). The commit that enabled it wasn't caught in the initial revert.

If/when you reland it would make sense to include enabling this test as part of it. (passes on our bot when this change is included)

MaskRay reopened this revision.Apr 7 2021, 11:17 AM
This revision is now accepted and ready to land.Apr 7 2021, 11:17 AM
MaskRay updated this revision to Diff 335877.Apr 7 2021, 11:18 AM
MaskRay retitled this revision from [sanitizer] Simplify GetTls with dl_iterate_phdr on Linux to [sanitizer] Simplify GetTls with dl_iterate_phdr on Linux and use it on musl/FreeBSD.
MaskRay edited the summary of this revision. (Show Details)

Fold D99871 into it

Keep InitTlsSize

MaskRay updated this revision to Diff 337901.Apr 15 2021, 2:28 PM
MaskRay edited the summary of this revision. (Show Details)

Special case Linux x86-64

MaskRay updated this revision to Diff 337912.Apr 15 2021, 2:48 PM
MaskRay edited the summary of this revision. (Show Details)

Update description

This revision was landed with ongoing or failed builds.Apr 15 2021, 3:35 PM
This revision was automatically updated to reflect the committed changes.
morehouse added inline comments.
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
197

This is triggering "unused variable" errors on Android: https://lab.llvm.org/buildbot/#/builders/77/builds/5507

/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp:198:12: error: unused variable 'g_use_dlpi_tls_data' [-Werror,-Wunused-variable]
static int g_use_dlpi_tls_data;

PPC bots are also segfaulting during some MSan test after this patch: https://lab.llvm.org/buildbot/#/builders/19/builds/3665, https://lab.llvm.org/buildbot/#/builders/105/builds/8438

I'm not 100% sure, but I think it's likely this patch is the culprit.

******************** TEST 'UBSan-MemorySanitizer-powerpc64le :: TestCases/ImplicitConversion/integer-truncation.c' FAILED ********************
...
Exit Code: 254
Command Output (stderr):
--
clang-13: error: unable to execute command: Segmentation fault (core dumped)
clang-13: error: linker command failed due to signal (use -v to see invocation)

As mentioned above, this seems to be causing segfaults on PPC buildbots; did some builds on a few revisions and this seems to be the cause.

MaskRay added a subscriber: amyk.EditedApr 16 2021, 9:26 AM

PPC bots are also segfaulting during some MSan test after this patch: https://lab.llvm.org/buildbot/#/builders/19/builds/3665, https://lab.llvm.org/buildbot/#/builders/105/builds/8438

I'm not 100% sure, but I think it's likely this patch is the culprit.

******************** TEST 'UBSan-MemorySanitizer-powerpc64le :: TestCases/ImplicitConversion/integer-truncation.c' FAILED ********************
...
Exit Code: 254
Command Output (stderr):
--
clang-13: error: unable to execute command: Segmentation fault (core dumped)
clang-13: error: linker command failed due to signal (use -v to see invocation)

Yes, I've reported it to @amyk . There may be a GNU ld 2.30 bug. I have tested the patch on a Ubunbu 18.04 machine with GNU ld 2.31: there is no segfault.
I think the bot may need to upgrade binutils - this is not the first time I've seen compiler/linker crashes on the ppc64le build bot - I don't even know how to work around it if we don't have more information about the nature.

The -Wunused-variable has been fixed by 376db8eaef3a06cf963eba9e8009048dc93689f0

@MaskRay
Seems that it caused
https://bugs.llvm.org/show_bug.cgi?id=50017
cannot link on s390x

could you please have a look?

Hello!

This seems to have broken the sanitizer build on Linux SPARC: http://lab.llvm.org:8014/#/builders/113/builds/756

[433/4593] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonSymbolizerNoHooks.sparc.dir/sanitizer_symbolizer.cpp.o
[434/4593] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonSymbolizerNoHooks.sparcv9.dir/sanitizer_symbolizer_posix_libcdep.cpp.o
[435/4593] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibcNoHooks.sparc.dir/sanitizer_linux_libcdep.cpp.o
FAILED: projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibcNoHooks.sparc.dir/sanitizer_linux_libcdep.cpp.o 
/usr/bin/c++ -DHAVE_RPC_XDR_H=1 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iprojects/compiler-rt/lib/sanitizer_common -I/var/lib/buildbot/workers/debian-stadler-sparc64/clang-sparc64-linux-multistage/llvm/compiler-rt/lib/sanitizer_common -Iinclude -I/var/lib/buildbot/workers/debian-stadler-sparc64/clang-sparc64-linux-multistage/llvm/llvm/include -I/var/lib/buildbot/workers/debian-stadler-sparc64/clang-sparc64-linux-multistage/llvm/compiler-rt/lib/sanitizer_common/.. -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -Wall -std=c++14 -Wno-unused-parameter -O3   -m32 -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fvisibility=hidden -fno-lto -O3 -g -Wno-variadic-macros -nostdinc++ -fno-rtti -Wframe-larger-than=570 -DSANITIZER_SUPPORTS_WEAK_HOOKS=0 -UNDEBUG -std=c++14 -MD -MT projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibcNoHooks.sparc.dir/sanitizer_linux_libcdep.cpp.o -MF projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibcNoHooks.sparc.dir/sanitizer_linux_libcdep.cpp.o.d -o projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibcNoHooks.sparc.dir/sanitizer_linux_libcdep.cpp.o -c /var/lib/buildbot/workers/debian-stadler-sparc64/clang-sparc64-linux-multistage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
/var/lib/buildbot/workers/debian-stadler-sparc64/clang-sparc64-linux-multistage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp: In function ‘void __sanitizer::GetTls(__sanitizer::uptr*, __sanitizer::uptr*)’:
/var/lib/buildbot/workers/debian-stadler-sparc64/clang-sparc64-linux-multistage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp:436:3: error: ‘GetStaticTlsBoundary’ was not declared in this scope
  436 |   GetStaticTlsBoundary(addr, size, &align);
      |   ^~~~~~~~~~~~~~~~~~~~

Shall I open a bug report for that?

glaubitz added inline comments.Apr 20 2021, 10:46 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
195

I think the end of this line should be ||, not && as otherwise GetStaticTlsBoundary() is referenced on SPARC but not defined.

431–433

This snippet will be used on SPARC calling GetStaticTlsBoundary while the definition above is not defined for just SANITIZER_FREEBSD || SANITIZER_LINUX but also requires on of the architecture definitions which causes the build to fail on SPARC.

MaskRay marked an inline comment as done.Apr 20 2021, 10:57 AM
MaskRay added inline comments.
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
195

I thought Linux architectures not listed in this list are irrelevant. (The file does not mention __sparc__.) But it turns out that Linux sparc does have some minimum amount of support. It doesn't implement GetTls, so tens of tests may fail.

Since this is now a generic implementation, you may try simply adding __sparc__ to the list.
Then add __sparc__ to a following list: #if defined(__x86_64__) || defined(__i386__) || defined(__s390__).

Likely the state will be no worse than the previous, probably slightly better.

I don't have a sparc so I cannot test, though.

glaubitz added inline comments.Apr 20 2021, 11:21 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
195

I thought Linux architectures not listed in this list are irrelevant.

I assume that means on these architectures SANITIZER_LINUX will not be defined, will it?

It doesn't implement GetTls, so tens of tests may fail.

Is it hard to implement?

Since this is now a generic implementation, you may try simply adding sparc to the list.

That's what I also tried and fixed the build for me.

I don't have a sparc so I cannot test, though.

Note: There are multiple SPARCs available in the GCC compile farm (https://gcc.gnu.org/wiki/CompileFarm), running both Solaris and Linux. The machine gcc202is running Debian unstable and I'm the admin of it.

MaskRay marked 2 inline comments as done.Apr 20 2021, 11:37 AM
MaskRay added inline comments.
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
195

I thought Linux architectures not listed in this list are irrelevant.

I assume that means on these architectures SANITIZER_LINUX will not be defined, will it?

The other sparc OSes are sparc-rtems and sparc-solaris. The state is likely not very good, either.

https://en.wikipedia.org/wiki/SPARC : "... terminated SPARC design after the completion of the M8."

It doesn't implement GetTls, so tens of tests may fail.

Is it hard to implement?

It was tricky before this patch. This file needed to know the precise size of glibc struct pthread (see the long list enumerating x86-64 glibc struct sizes). It would add lots of maintenance burden and specifically one of the reasons I think we should do this patch. So figuring out a way supporting these things without adding complexity is my goal.

Note: There are multiple SPARCs available in the GCC compile farm (https://gcc.gnu.org/wiki/CompileFarm), running both Solaris and Linux. The machine gcc202is running Debian unstable and I'm the admin of it.

Registered. I can take a look at the build.

glaubitz added inline comments.Apr 20 2021, 12:01 PM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
195

https://en.wikipedia.org/wiki/SPARC : "... terminated SPARC design after the completion of the M8."

True, but Solaris on SPARC is still fully supported. Debian and Gentoo also still support SPARC.

It was tricky before this patch. This file needed to know the precise size of glibc struct pthread (see the long list enumerating x86-64 glibc struct sizes).

OK, but it should be feasible to figure this information out.

It would add lots of maintenance burden and specifically one of the reasons I think we should do this patch. So figuring out a way supporting these things without adding complexity is my goal.

Sounds like a very good idea. Thanks for doing this.

Registered. I can take a look at the build.

Thanks, much appreciated. Let me know when I can be of any help.