Page MenuHomePhabricator

[ASAN] Fix crash on i?86-linux (32-bit) against glibc 2.27 and later
ClosedPublic

Authored by jakubjelinek on Mar 19 2018, 6:47 AM.

Details

Summary

Running sanitized 32-bit x86 programs on glibc 2.27 crashes at startup, with:

ERROR: AddressSanitizer: SEGV on unknown address 0xf7a8a250 (pc 0xf7f807f4 bp 0xff969fc8 sp 0xff969f7c T16777215)
The signal is caused by a WRITE memory access.
#0 0xf7f807f3 in _dl_get_tls_static_info (/lib/ld-linux.so.2+0x127f3)
#1 0xf7a92599  (/lib/libasan.so.5+0x112599)
#2 0xf7a80737  (/lib/libasan.so.5+0x100737)
#3 0xf7f7e14f in _dl_init (/lib/ld-linux.so.2+0x1014f)
#4 0xf7f6eb49  (/lib/ld-linux.so.2+0xb49)

The problem is that glibc changed the calling convention for the GLIBC_PRIVATE
symbol that sanitizer uses (even when it should not, GLIBC_PRIVATE is exactly
for symbols that can change at any time, be removed etc.), see
https://sourceware.org/ml/libc-alpha/2017-08/msg00497.html

Fixes https://github.com/google/sanitizers/issues/954

Diff Detail

Repository
rL LLVM

Event Timeline

jakubjelinek created this revision.Mar 19 2018, 6:47 AM

glibc 2.27 has added the glob@@GLIBC_2.27 symbols approx. one month after these internal_function changes, so it is closer to that than e.g. trying to parse confstr for glibc 2.27 and later, and is also smaller than trying to parse the confstr string.

Is this related to https://github.com/google/sanitizers/issues/914 or this is another problem?
(also: https://sourceware.org/ml/libc-alpha/2018-02/msg00567.html)

even when it should not

Any chance to get https://sourceware.org/glibc/wiki/ThreadPropertiesAPI moving?

Is a test possible? Or is this covered by an existing test? Which one?

I'm OOO for a few days, Vitaly, please review further.

compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
205

nested ifdefs? I'd prefer to avoid them. Move this code into a separate function, maybe?

206

This code uses // comments.

vitalybuka requested changes to this revision.Mar 21 2018, 11:37 AM
vitalybuka added inline comments.
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
161

Do we need to check for this?
Maybe just always go for dlvsym(RTLD_NEXT, "glob", "GLIBC_2.27") ?

205

maybe?

template<typename GetTlsType>
void GetTlsImpl() {
   void *get_tls_static_info_ptr = dlsym(RTLD_NEXT, "_dl_get_tls_static_info");
   GetTlsType get_tls;
    CHECK_EQ(sizeof(get_tls), sizeof(get_tls_static_info_ptr));
    internal_memcpy(&get_tls, &get_tls_static_info_ptr,
                    sizeof(get_tls_static_info_ptr));
    CHECK_NE(get_tls, 0);
    get_tls(&tls_size, &tls_align);
}

static void GetTls() {
#if defined(__i386__)
   if (!dlvsym(RTLD_NEXT, "glob", "GLIBC_2.27"))
     return GetTlsImpl<>;
#endif
 return GetTlsImpl<>
}
This revision now requires changes to proceed.Mar 21 2018, 11:37 AM
In D44623#1042762, @kcc wrote:

Is this related to https://github.com/google/sanitizers/issues/914 or this is another problem?

Dunno, that issue talks about glibc 2.25, I'm not aware of issues with glibc 2.25, this one is about 2.27.

Any chance to get https://sourceware.org/glibc/wiki/ThreadPropertiesAPI moving?

That is a question for the glibc people, I'm out of the glibc business for 10+ years already.

jakubjelinek added inline comments.Mar 21 2018, 12:00 PM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
161

It is not strictly needed, just an optimization. At least in GCC, if somebody configures the compiler against some version of binutils, glibc etc., we assume that at runtime that version or newer of it is available, so it is an optimization; if somebody configures against glibc 2.27+, it avoids the dlvsym calls on startup of every sanitized app.

205

Attributes on function pointers passed to templates have historically been problematic in several compilers, so I think it is better to avoid that.
E.g. clang++ 6.0.0 r319306 rejects:

typedef void (*F) (int) __attribute__((regparm (3), stdcall));
typedef void (*G) (int);

template <typename T>
void foo (T x)
{
  x (0);
}

void
bar (F x, G y)
{
  foo (x);
  foo (y);
}

g++ before 5 emit 2 different instantiations with the same mangled name and so it fails to compile, g++ 5 ICEs on it, only 6+ handles it right and provide different mangled names for both.
So, if it would be a template, it would need to have 2 different classes that would contain say a typedef in them and that is what the template would call.

vitalybuka added inline comments.Mar 21 2018, 1:03 PM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
161
#if defined(__i386__) && (!defined(__GLIBC_PREREQ) || !__GLIBC_PREREQ(2, 27))
#define CHECK_GET_TLS_STATIC_INFO_VERSION 1
#else
#define CHECK_GET_TLS_STATIC_INFO_VERSION 0
#endif

...

if (CHECK_GET_TLS_STATIC_INFO_VERSION && !dlvsym(RTLD_NEXT, "glob", "GLIBC_2.27")) {

Is a test possible? Or is this covered by an existing test? Which one?

When unpatched, all 781 i386 tests fail with glibc 2.27. I think that should be good enough :-)

Is this related to https://github.com/google/sanitizers/issues/914 or this is another problem?

Could be related, but it can be fixed independently as it is just a different calling convention. See https://github.com/google/sanitizers/issues/954

As mentioned in the comments in that bug, would it be worth factoring out the confstr minor/patch version check from ThreadDescriptorSize and reusing that function here instead of calling dlvsym? Glibc 2.27 is currently released as stable, so pre-2.27 versions after 2.26 should be nearly non-existent right?

jakubjelinek added inline comments.May 9 2018, 8:27 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
161

This will not compile if __GLIBC_PREREQ is not defined, it has to be nested #if or __GLIBC_PREREQ needs to be always defined.

Updated patch, which uses a template with 2 support classes to avoid some code duplication. I must say that I find

#ifndef __GLIBC_PREREQ
# define __GLIBC_PREREQ(x, y) 0
#endif

easier to read over the longer sequence of #ifs, and as I said earlier, the proposed

#if defined(__i386__) && (!defined(__GLIBC_PREREQ) || !__GLIBC_PREREQ(2, 27))

is not valid C/C++ if __GLIBC_PREREQ macro is not defined.

Other than the inline lint warnings, it looks good to me. (The patch is also missing the directory, but it applies otherwise.)

I tried this and the check-sanitizer tests pass with glibc 2.27. A simple 32-bit test program also passes both glibc 2.27 and 2.26.

sanitizer_linux_libcdep.cc
196 ↗(On Diff #145936)

Lint complains here:

sanitizer_linux_libcdep.cc:196:  Namespace should be terminated with "// namespace"  [readability/namespace] [5]
210 ↗(On Diff #145936)

More lint complaints:

sanitizer_linux_libcdep.cc:210:  Tab found; better to use spaces  [whitespace/tab] [1]
sanitizer_linux_libcdep.cc:213:  Tab found; better to use spaces  [whitespace/tab] [1]
Lekensteyn accepted this revision.May 9 2018, 10:15 AM

LGTM, are others also happy with this?

vitalybuka added inline comments.May 9 2018, 11:30 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
174

Do we need this if we do runtime check anyway?
We can't get into GetTlsStaticInfoRegparmCall with CHECK_GET_TLS_STATIC_INFO_VERSION == 0

238

Is this the same as __GLIBC_PREREQ()

jakubjelinek added inline comments.May 9 2018, 11:40 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
174

If you mean the __GLIBC_PREREQ, that is an optimization to avoid doing the runtime check if we know the check is unnecessary.
If the CallGetTls<GetTlsStaticInfoCall>(get_tls_static_info_ptr, ...); call is not ifdefed out through preprocessor, then yes, it is needed.

238

confstr is a runtime check, __GLIBC_PREREQ is a compile time check.

vitalybuka added inline comments.May 9 2018, 12:32 PM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
174

This initialization once per process not thread, so I don't see reasons to complicate code for such minor improvement.

238

Right,
than this is should be similar to dlvsym(RTLD_NEXT, "glob", "GLIBC_2.27"))?

@vitalybuka I think that this patch is making good progress, so I abandoned the other patch.

If forward compatibility is important, I would change the __GLIBC_PREREQ check. Otherwise it looks ready to me.

compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
174

For forward compatibility, what about dropping __GLIBC_PREREQ? E.g.:

#if defined(__i386__) && defined(__GLIBC_PREREQ)
# define CHECK_GET_TLS_STATIC_INFO_VERSION 1
# define DL_INTERNAL_FUNCTIOn ...
#else
# define CHECK_GET_TLS_STATIC_INFO_VERSION 0
#endif

Then a binary built with clang on glibc 2.26 will still work correctly with glibc 2.27.

238

It is not possible, there are multiple versions being checked. D46638 tried to reuse this version check with the above, but that failed because confstr is intercepted by MSAN.

The better alternative seems gnu_get_libc_version, but there were concerns about dlsym being expensive. Not that big of a deal imo, but as the current patch looks fine to me, I dropped the other patch.

vitalybuka added inline comments.May 10 2018, 10:30 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
185

Maybe just inline it here:

typedef void (*get_tls_func)(size_t*, size_t*) __attribute__((regparm(3), stdcall))
kcc added a comment.May 21 2018, 6:15 PM

Is this code review stuck?

jakubjelinek added inline comments.May 22 2018, 6:06 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
185

That isn't possible. If you don't want ifdefs in the code, in particular the
​ if (CHECK_GET_TLS_STATIC_INFO_VERSION &&
​ !dlvsym(RTLD_NEXT, "glob", "GLIBC_2.27"))
​ CallGetTls<GetTlsStaticInfoRegparmCall>(get_tls_static_info_ptr,
​ &tls_size, &tls_align);
code to be ifdefed out if not i?86, then the GetTlsStaticInfoRegparmCall class must be defined even for other targets, even when after optimizations it will not be really used, and so you can't use i?86 specific attributes in there.

vitalybuka accepted this revision.May 22 2018, 3:50 PM
This revision is now accepted and ready to land.May 22 2018, 3:50 PM
petarj added a subscriber: petarj.EditedJun 8 2018, 9:35 AM

What's the status of this change?

I can see five failures after it is applied. Here they are:

Failing Tests (5):

  Builtins-i386-linux :: divsc3_test.c
  LeakSanitizer-AddressSanitizer-x86_64 :: TestCases/Linux/use_tls_dynamic.cc
  LeakSanitizer-Standalone-x86_64 :: TestCases/Linux/use_tls_dynamic.cc
  MemorySanitizer-X86_64 :: dtls_test.c
  MemorySanitizer-lld-X86_64 :: dtls_test.c

Expected Passes    : 43861
Expected Failures  : 221
Unsupported Tests  : 1992
Unexpected Failures: 5

It has been reviewed and is ready for landing.

Could you push it @jakubjelinek? If nobody takes care of it I'll try to merge it later.

Could you push it @jakubjelinek? If nobody takes care of it I'll try to merge it later.

I don't have an account. Can somebody push it for me? Thanks.

Lekensteyn retitled this revision from Fix asan on i?86-linux (32-bit) against glibc 2.27 and later to [ASAN] Fix crash on i?86-linux (32-bit) against glibc 2.27 and later.Jun 8 2018, 10:01 AM
Lekensteyn edited the summary of this revision. (Show Details)
Lekensteyn edited the summary of this revision. (Show Details)Jun 8 2018, 10:05 AM

I'll perform some final tests and push it once it passes.

Lekensteyn added a comment.EditedJun 8 2018, 11:00 AM

I can see five failures after it is applied. Here they are:

Failing Tests (5):

  Builtins-i386-linux :: divsc3_test.c
  LeakSanitizer-AddressSanitizer-x86_64 :: TestCases/Linux/use_tls_dynamic.cc
  LeakSanitizer-Standalone-x86_64 :: TestCases/Linux/use_tls_dynamic.cc
  MemorySanitizer-X86_64 :: dtls_test.c
  MemorySanitizer-lld-X86_64 :: dtls_test.c
 
Expected Passes    : 43861
Expected Failures  : 221
Unsupported Tests  : 1992
Unexpected Failures: 5

Are these tests passing with this patch reverted? Edit 3: no, the same tests (plus more i386 tests failures). Tested with glibc 2.27-3 on Arch Linux and:

  • r334302 llvm
  • r334272 projects/compiler-rt
  • r334281 tools/clang
  • r334287 tools/clang/tools/extra

I just ran check-lsan and can confirm the test failure:

******************** TEST 'LeakSanitizer-AddressSanitizer-x86_64 :: TestCases/Linux/use_tls_dynamic.cc' FAILED ********************
Indirect leak of 1337 byte(s) in 1 object(s) allocated from:
    #0 0x4ec28f in __interceptor_malloc projects/compiler-rt/lib/asan/asan_malloc_linux.cc:121:3
    #1 0x52af83 in main projects/compiler-rt/test/lsan/TestCases/Linux/use_tls_dynamic.cc:27:13
    #2 0x7ffa49a4006a in __libc_start_main (/usr/lib/libc.so.6+0x2306a)

Objects leaked above:
0x61a000000680 (1337 bytes)

-----------------------------------------------------
Suppressions used:
  count      bytes template
      1    1048576 *tls_get_addr*
-----------------------------------------------------

Perhaps the new implementation leaks pointers on the stack, causing false positives?

If I run the reproducer from the linked issue (clang -g -fsanitize=address -m32 empty.c -o empty && ./empty), it crashes without this patch and passes with this patch applied. Building the binary on Arch Linux x86_64 (glib 2.27-3) and running it on Ubuntu 16.04 (glibc 2.23-0ubuntu10) still crashes though. That is expected with this version of the patch as it assumes that libc used for compiling clang matches the runtime libc.

(D46638 was an alternative version that performs a runtime check, but it would need more changes so the current patch seems a good start.)

Edit:

Failing Tests (21):
    AddressSanitizer-Unit :: ./Asan-i386-calls-Dynamic-Test/AddressSanitizer.asm_load_store
    AddressSanitizer-Unit :: ./Asan-i386-calls-Dynamic-Test/AddressSanitizer.asm_rep_movs
    AddressSanitizer-Unit :: ./Asan-i386-calls-Dynamic-Test/AddressSanitizer.asm_rw
    AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer.asm_load_store
    AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer.asm_rep_movs
    AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer.asm_rw
    AddressSanitizer-Unit :: ./Asan-i386-inline-Dynamic-Test/AddressSanitizer.asm_load_store
    AddressSanitizer-Unit :: ./Asan-i386-inline-Dynamic-Test/AddressSanitizer.asm_rep_movs
    AddressSanitizer-Unit :: ./Asan-i386-inline-Dynamic-Test/AddressSanitizer.asm_rw
    AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer.asm_load_store
    AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer.asm_rep_movs
    AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer.asm_rw
    Builtins-i386-linux :: divsc3_test.c
    DataFlowSanitizer-x86_64 :: custom.cc
    LeakSanitizer-AddressSanitizer-x86_64 :: TestCases/Linux/use_tls_dynamic.cc
    LeakSanitizer-Standalone-x86_64 :: TestCases/Linux/use_tls_dynamic.cc
    MemorySanitizer-X86_64 :: Linux/sunrpc.cc
    MemorySanitizer-X86_64 :: Linux/sunrpc_bytes.cc
    MemorySanitizer-X86_64 :: Linux/sunrpc_string.cc
    MemorySanitizer-X86_64 :: dtls_test.c
    ThreadSanitizer-x86_64 :: sunrpc.cc

  Expected Passes    : 32752
  Expected Failures  : 112
  Unsupported Tests  : 11941
  Unexpected Failures: 21
  • AddressSanitizer-Unit: AddressSanitizer: nested bug in the same thread, aborting. Edit 2: this segfaults in ErrorDescription() { internal_memset(this, 0, sizeof(*this)); }:
#0  0x080b2e04 in __asan::ErrorGeneric::ErrorGeneric (this=0xffffbac0, tid=0, pc_=136535675, bp_=4294952244, sp_=4294952232, addr=4102034300, is_write_=false, access_size_=4) at projects/compiler-rt/lib/asan/asan_errors.cc:368
#1  0x0818b479 in __asan::ReportGenericError (pc=136535675, bp=4294952244, sp=4294952232, addr=4102034300, is_write=false, access_size=4, exp=0, fatal=true) at projects/compiler-rt/lib/asan/asan_report.cc:453
#2  0x0818c37a in __asan::__asan_report_load4 (addr=4102034300) at projects/compiler-rt/lib/asan/asan_rtl.cc:114
#3  0x08235e7b in AsmLoad () at projects/compiler-rt/lib/asan/tests/asan_asm_test.cc:181
#4  TestBody () at projects/compiler-rt/lib/asan/tests/asan_asm_test.cc:215
  • divsc3_test.c exited with 1, but no other output.
  • The custom.cc failure: custom.cc.tmp: projects/compiler-rt/test/dfsan/custom.cc:429: void test_gethostname(): Assertion 0 == dfsan_read_label(buf + 2, 2)' failed.`
  • use_tls_dynamic failed as noted above (memleak report)
  • sunrpc tests fail due to: projects/compiler-rt/test/tsan/sunrpc.cc:4:10: fatal error: 'rpc/types.h' file not found. Reported at https://github.com/google/sanitizers/issues/974
  • The dtls_test.c test failure:
MemorySanitizer: use-of-uninitialized-value
    #0 0x493e57 in Thread1 /home/peter/projects/llvm/projects/compiler-rt/test/msan/dtls_test.c:25:7
    #1 0x7f329e96c074 in start_thread (/usr/lib/libpthread.so.0+0x7074)
    #2 0x7f329dce853e in __GI___clone (/usr/lib/libc.so.6+0xf853e)
Lekensteyn closed this revision.Jun 10 2018, 1:12 PM

Oops, I typoed "Differential Revision". Closing manually.

Thanks for the patch, all tests are still passing!

vitalybuka added inline comments.Jun 16 2018, 12:42 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
208

This breaks lsan:

==211269==Sanitizer CHECK failed: compiler-rt/lib/lsan/lsan_interceptors.cc:54 ((!lsan_init_is_running)) != (0) (0, 0)

dlvsym calls malloc when lsan is not initialized

Lekensteyn added inline comments.Jun 16 2018, 2:01 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
208

Ok, I can reproduce this with -fsanitize=leak (but not -fsanitize=address), but only in this setting:

  • #define CHECK_GET_TLS_STATIC_INFO_VERSION 1
  • dlvsym(RTLD_NEXT, "globsomethingnonexisting", ...)

This presumably affects only 32-bit libraries with glibc before 2.27.

I did not see any buildbot complain on this though, how did you catch this issue? It seems a lack of test coverage.

If it does not need an immediate fix (buildbots are not blocked), I could try to work on using the confstr patch again to dynamically query the glibc version which should avoid malloc. Otherwise the patch needs to be reverted.

vitalybuka added inline comments.Jun 16 2018, 3:29 PM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
208

-fsanitize=leak (but not -fsanitize=address), but only in this setting:

Asan malloc interceptor will use internal buffer for allocations which happen before asan is initialized.

I did not see any buildbot complain on this though, how did you catch this issue? It seems a lack of test coverage

Looks like bots have no i386 libs installed. I will try to fix that.

vitalybuka added inline comments.Jun 17 2018, 1:59 AM
compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cc
208

D48265 works for me