Page MenuHomePhabricator

Reland [lsan] Enable LSAN for Android
Needs ReviewPublic

Authored by oontvoo on Mon, Oct 12, 9:39 AM.

Details

Summary

Reland: a2291a58bf1c860d026581fee6fe96019dc25440.

New fixes for the breakages reported in D85927 include:

  • declare a weak decl for dl_iterate_phdr, because it does not exist on older APIs
  • Do not enable leak-sanitizer if api_level is less than 29, because of ld.lld: error: undefined symbol: __aeabi_read_tp for armv7, API level 16.
  • Put back the interceptor for memalign but still opt out intercepting __libc_memalign and cfree because both of these don't exist in Bionic.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
eugenis added inline comments.Tue, Oct 13, 3:58 PM
compiler-rt/lib/lsan/lsan_common.h
39

Mention the link problems due to TLS. Leak detection is flaky on all 32-bit platforms, but we don't disable it just because of that.
(also because the flakes are false-negative, not false-positive, so they make the tool less useful but, strictly speaking, not broken)

compiler-rt/lib/lsan/lsan_common_linux.cpp
108

"turned"

compiler-rt/lib/lsan/lsan_interceptors.cpp
122 ↗(On Diff #297965)

Two more underscores.
INTERCEPT___LIBC_MEMALIGN

115 ↗(On Diff #297653)

It's only a warning, right?
ASan intercepts these function and it is doing just fine.
Anyway, if they do not exist on any of the supported versions (going back to L), then it's good to remove the interceptors.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107

Hmm, this !SANITIZER_ANDROID here goes way back, and I'm not sure why. Must be something about linking the runtime as a shared library, which was once unique to android.

I think this is correct, but how did you run into this?

compiler-rt/lib/sanitizer_common/sanitizer_linux.h
156

We don't really know if it will be 31 or something else. Let's call is "S" for now.

compiler-rt/test/lit.common.cfg.py
69 ↗(On Diff #297965)

The bots use ANDROID_SERIAL environment variable to pick the device, but please use os.environ.get('ADB', 'adb').

oontvoo updated this revision to Diff 298002.Tue, Oct 13, 6:01 PM
oontvoo marked 6 inline comments as done.

updated diff

compiler-rt/lib/lsan/lsan_interceptors.cpp
115 ↗(On Diff #297653)

It's only a warning, right?

Weirdly, no, it's a dlerror. (This made the use_tls_dynamic.cpp test fail).

Anyway, if they do not exist on any of the supported versions (going back to L), then it's good to remove the interceptors.

cfree can definitely be removed. Even glibc says so.
Not too sure about __libc_memalign

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107

Hmm, this !SANITIZER_ANDROID here goes way back, and I'm not sure why. Must be something about linking the runtime as a shared library, which was once unique to android.

How do we verify that this change wouldn't blow up in some unexpected way?

I think this is correct, but how did you run into this?

Standalone LSAN was segfaulting.
As it turned out, LSAN uses .prenit_array to ensure __lsan_init is called before everything else.
If we don't remove this !SANITIZER_ANDROID then Standalone LSAN would never init for Android.

eugenis added inline comments.Tue, Oct 13, 7:14 PM
compiler-rt/lib/lsan/lsan_interceptors.cpp
115 ↗(On Diff #297653)

OK. The problem is that if cfree is defined in libc.so and used, and we do not intercept it, we are going to crash in a weird way.

Not sure where the dlerror comes from. See ASAN_INTERCEPT_FUNC - interception failures should not be fatal.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107

Right. Asan is initialized by inserting constructors in each instrumented module. Standalone lsan can not do this.

Check that we do not insert a preinit_array in the shared runtime library. But I think the linker should check this for us.

oontvoo marked an inline comment as done.Tue, Oct 13, 8:00 PM
oontvoo added inline comments.
compiler-rt/lib/lsan/lsan_interceptors.cpp
115 ↗(On Diff #297653)

OK. The problem is that if cfree is defined in libc.so and used, and we do not intercept it, we are going to crash in a weird way.

So we'll leave the cfree and __libc_memalign interceptors for non-android, and skip them on Android.
That should be ok, yes?

Not sure where the dlerror comes from. See ASAN_INTERCEPT_FUNC - interception failures should not be fatal.

Sorry, I should've been more specific. It's not fatal or anything but if app uses dlopen/dlsym and then calls dlerror() expecting to see nullptr (for success), it won't happen. Instead, dlerror() will return the interception failure message ("undefined symbol: __libc_memalign" or "undefined symbol: cfree").
That would be very confusing to users.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107

Check that we do not insert a preinit_array in the shared runtime library.

The libclang_rt-*asan.so ? I don't see any preinit_array section in it:

 objdump -D -j .preinit_array ./llvm_build64/lib/clang/12.0.0/lib/linux/libclang_rt.asan-i686-android.so

./llvm_build64/lib/clang/12.0.0/lib/linux/libclang_rt.asan-i686-android.so:     file format elf32-i386

objdump: section '.preinit_array' mentioned in a -j option, but not found in any input file

It does exist in the libclang_rt.lsan-i686-android.a

oontvoo updated this revision to Diff 298428.Thu, Oct 15, 11:37 AM
oontvoo marked an inline comment as done.

rebase

vitalybuka requested changes to this revision.Thu, Oct 15, 6:45 PM
vitalybuka added a subscriber: vitalybuka.

A lot of tests fails like:

/usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/test/sanitizer_common/android_commands/android_compile.py  /usr/local/google/home/vitalybuka/slow/bbot/llvm_build64/bin/clang  --driver-mode=g++ -stdlib=libstdc++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only  -pie -fuse-ld=gold --target=i686-linux-android --sysroot=/usr/local/google/home/vitalybuka/slow/bbot/android_ndk/standalone-i686/sysroot -B/usr/local/google/home/vitalybuka/slow/bbot/android_ndk/standalone-i686 -fuse-ld=lld  -shared-libasan -O1 -fsanitize-address-use-after-scope /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/test/asan/TestCases/use-after-scope-nobug.cpp -o /usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp &&  /usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp
WARNING: linker: /data/local/tmp/Output/usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp: unsupported flags DT_FLAGS_1=0x8000001
CANNOT LINK EXECUTABLE "/data/local/tmp/Output/usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp": unknown reloc type 14 @ 0xa8d63390 (638)
Aborted 
Aborted

It's from https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild

This revision now requires changes to proceed.Thu, Oct 15, 6:45 PM

I recommend to extract smaller patches which can be landed separately.

A lot of tests fails like:

/usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/test/sanitizer_common/android_commands/android_compile.py  /usr/local/google/home/vitalybuka/slow/bbot/llvm_build64/bin/clang  --driver-mode=g++ -stdlib=libstdc++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only  -pie -fuse-ld=gold --target=i686-linux-android --sysroot=/usr/local/google/home/vitalybuka/slow/bbot/android_ndk/standalone-i686/sysroot -B/usr/local/google/home/vitalybuka/slow/bbot/android_ndk/standalone-i686 -fuse-ld=lld  -shared-libasan -O1 -fsanitize-address-use-after-scope /usr/local/google/home/vitalybuka/src/llvm.git/llvm-project/compiler-rt/test/asan/TestCases/use-after-scope-nobug.cpp -o /usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp &&  /usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp
WARNING: linker: /data/local/tmp/Output/usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp: unsupported flags DT_FLAGS_1=0x8000001

CANNOT LINK EXECUTABLE "/data/local/tmp/Output/usr/local/google/home/vitalybuka/slow/bbot/compiler_rt_build_android_i686/test/asan/I386AndroidConfig/TestCases/Output/use-after-scope-nobug.cpp.tmp": unknown reloc type 14 @ 0xa8d63390 (638)
Aborted 
Aborted

Can you share the version of the x86 emulator it was using? (http://lab.llvm.org:8011/buildslaves/sanitizer-buildbot1 threw No such resource for me)
Is it pre-28?

oontvoo added a comment.EditedFri, Oct 16, 11:09 PM

I recommend to extract smaller patches which can be landed separately.

Split off the cmake and lit to DD89615
And moved the interceptors stuff to D89616

PTAL

Re: failures:

The issue was that -fno-emulated-tls was set, even for older Android. DD89615 will also take care of not setting that for older android

vitalybuka added inline comments.Tue, Oct 20, 1:41 AM
compiler-rt/lib/lsan/lsan_common_linux.cpp
37

why this need to be weak?

61–65
// LSAN is only enabled with Android-S and up.
if (!HAS_ANDROID_THREAD_PROPERTIES_API)
  return;
106–111
compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107–108

please clang-format

compiler-rt/test/lsan/TestCases/Linux/log-path_test.cpp
10 ↗(On Diff #298808)

Is this some debug leftover?
This test is executed by any linux build and going to fail on non-undroid.

compiler-rt/test/lsan/TestCases/Linux/use_tls_pthread_specific_static.cpp
22–25 ↗(On Diff #298808)

Could you remove this change to avoid unnececary ifdef

compiler-rt/test/lsan/TestCases/stale_stack_leak.cpp
9 ↗(On Diff #298808)

inconsistent spaces

compiler-rt/test/lsan/TestCases/use_registers.cpp
27 ↗(On Diff #298808)

inconsistent formatting

oontvoo updated this revision to Diff 299379.Tue, Oct 20, 9:02 AM
oontvoo marked 9 inline comments as done.

updated diff

compiler-rt/lib/lsan/lsan_common_linux.cpp
37

Because older Androids do not have dl_iterate_phdr. They will get a compilation error. (dmajor@ shared the error log a few comments back).
It's worth noting that we're already doing this weak-decl for dl_iterate_phdr in quite a few other places.

61–65
// LSAN is only enabled with Android-S and up.
if (!HAS_ANDROID_THREAD_PROPERTIES_API)
  return;

Still need the SANITIZER_ANDROID because
HAS_ANDROID_THREAD_PROPERTIES_API will be false on Non-Android, which means we'd always disable.

106–111

Sorry, why do you want to change the compile-time condition to a runtime check?
We probably don't want to pay this cost if it's not Android.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107–108

Done.

compiler-rt/test/lsan/TestCases/Linux/log-path_test.cpp
10 ↗(On Diff #298808)

Is this some debug leftover?

No. :)

This test is executed by any linux build and going to fail on non-undroid.

On non-android, these macros are mostly no-op.
The run command on line 15 wants to FileCheck a *local* path, which it will not see if we're testing Android, because the log is on the device. So all these new additions are to pull the file from the target device back to the host.

compiler-rt/test/lsan/TestCases/Linux/use_tls_pthread_specific_static.cpp
22–25 ↗(On Diff #298808)

No, because without this condition, the test will fail on Android, as this assertion does not apply to Bionic.

vitalybuka added inline comments.Tue, Oct 20, 2:19 PM
compiler-rt/lib/lsan/lsan_common_linux.cpp
37

@eugenis Why do we care about compiling with old API?

61–65

Oh, you use this to completely shutdown lsan on pre-S
I think it's better to abort with error on initialization with error message, if lsan is enabled and not S.

I don't like the user may think that lsan is enabled when it's not.

106–111

SANITIZER_ANDROID is compile time constant, it will have no runtime overhead
However normal if looks better and and compiles/validates code on all platforms.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107–108

I expected !defined(PIC) on a previous line, but please keep whatever is done clang-format

compiler-rt/test/lsan/TestCases/Linux/log-path_test.cpp
10 ↗(On Diff #298808)

I tried previous snapshot on Linux and it failed. I see you just added these substitutions :)

compiler-rt/test/lsan/TestCases/Linux/use_tls_pthread_specific_static.cpp
22–25 ↗(On Diff #298808)

oh, sorry, you didn't add the assert
I would say just remove the assert everywhere, but up to you.

eugenis added inline comments.Tue, Oct 20, 2:23 PM
compiler-rt/lib/lsan/lsan_common_linux.cpp
37

Because we do not want to ship a copy of the runtime library for every sdk level.
NDK currently ships a single copy - it's part of the toolchain and not of the sysroot.

vitalybuka added inline comments.Tue, Oct 20, 2:33 PM
compiler-rt/lib/lsan/lsan_common_linux.cpp
37

This is compile time check. Do you mean we ship different runtime for each API level?

oontvoo updated this revision to Diff 299513.Tue, Oct 20, 4:38 PM
oontvoo marked 5 inline comments as done.

updated diff

compiler-rt/lib/lsan/lsan_common_linux.cpp
61–65

Oh, you use this to completely shutdown lsan on pre-S

Yes

I think it's better to abort with error on initialization with error message, if lsan is enabled and not S.

LSAN is enabled unconditionally. (Same reason as eugenis@ has given wrt. shipping same runtime for different versions).
As such, we can't error out here because users on pre-S (which is to say everybody right now) will get an error.

I don't like the user may think that lsan is enabled when it's not.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
107–108

Pulling !defined(PIC) to the next line was clang-format's doing.
(Not sure why - it clearly fits within 80 character)

compiler-rt/test/lsan/TestCases/Linux/use_tls_pthread_specific_static.cpp
22–25 ↗(On Diff #298808)

oh, sorry, you didn't add the assert
I would say just remove the assert everywhere, but up to you.

compiler-rt/test/lsan/TestCases/use_registers.cpp
27 ↗(On Diff #298808)

(This, again, was also clang-format)

srhines added inline comments.Tue, Oct 20, 5:03 PM
compiler-rt/lib/lsan/lsan_common_linux.cpp
37

No. It gets built for the lowest supported level generally, and we ship that. Making the symbol weak for these lower level targets is correct.

vitalybuka added inline comments.Wed, Oct 21, 1:14 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux.h
162

How about we remove all references to HAS_ANDROID_THREAD_PROPERTIES_API and assume it's true

and add

void InitializeCommonFlags(CommonFlags *cf) {
  // need to record coverage to generate coverage report.
  cf->coverage |= cf->html_cov_report;
  SetVerbosity(cf->verbosity);
  if (!HAS_ANDROID_THREAD_PROPERTIES_API)
    cf->delect_leaks = false;
}

The patch as is still fails with emulator about 71 tests.
Let's try to cut it in pieces and submit safe one by one.
If you don't have committer access let me know and I can select and land some parts myself.

oontvoo updated this revision to Diff 299688.Wed, Oct 21, 7:44 AM
oontvoo marked an inline comment as done.

updated diff

oontvoo added a comment.EditedWed, Oct 21, 7:44 AM

The patch as is still fails with emulator about 71 tests.

Were all these 71 tests from check-lsan? I was planning on letting D89615 go in first, then this patch. (Otherwise, this patch would have to be submitted with LSAN-on-Android still disabled.)

Let's try to cut it in pieces and submit safe one by one.

I'll move the test update to a separate patch and submit them there. The rest of this patch, however, should stay together.

Edit: Moved to D89884. PTAL

oontvoo updated this revision to Diff 299696.Wed, Oct 21, 8:04 AM

updated diff

The patch as is still fails with emulator about 71 tests.

Were all these 71 tests from check-lsan? I was planning on letting D89615 go in first, then this patch. (Otherwise, this patch would have to be submitted with LSAN-on-Android still disabled.)

yes, it was tls related. you can add D89615 as a parent to avoid confusion

vitalybuka added inline comments.Wed, Oct 21, 5:49 PM
compiler-rt/lib/lsan/lsan.cpp
63 ↗(On Diff #299831)

__lsan_default_options can be part of an application and reenabled it again
why not InitializeCommonFlags?

oontvoo added inline comments.Wed, Oct 21, 5:52 PM
compiler-rt/lib/lsan/lsan.cpp
63 ↗(On Diff #299831)

__lsan_default_options can be part of an application and reenabled it again

This overrides the commonflag, so we had to set it again.

why not InitializeCommonFlags?

Yes, see InitializeCommonFlags in sanitizer_flags.cpp below.

oontvoo updated this revision to Diff 299836.Wed, Oct 21, 5:56 PM
oontvoo marked an inline comment as done.

updated

compiler-rt/lib/lsan/lsan.cpp
63 ↗(On Diff #299831)

PS: nvm - this comes before the common-init.