This is an archive of the discontinued LLVM Phabricator instance.

[sanitizer] Intercept glibc 2.38 __isoc23_* functions
ClosedPublic

Authored by MaskRay on Aug 26 2023, 1:38 PM.

Details

Summary

strtol("0b1", 0, 0) can be (pre-C23) 0 or (C23) 1.
sscanf("0b10", "%i", &x) is similar. glibc 2.38 introduced
__isoc23_strtol and __isoc23_scanf family functions for binary
compatibility.

When _ISOC2X_SOURCE is defined (implied by _GNU_SOURCE) or
__STDC_VERSION__ > 201710L, __GLIBC_USE_ISOC2X is defined to 1 and
these __isoc23_* symbols are used.

Add __isoc23_ versions for the following interceptors:

  • sanitizer_common_interceptors.inc implements strtoimax/strtoumax. Remove incorrect FIXME about https://github.com/google/sanitizers/issues/321
  • asan_interceptors.cpp implements just strtol and strtoll. The default replace_str mode checks nptr is readable and endptr is writable. atoi reuses the existing strtol interceptor.
  • msan_interceptors.cpp implements strtol family functions and their _l versions. Tested by lib/msan/tests/msan_test.cpp
  • sanitizer_common_interceptors.inc implements scanf family functions.

The strtol family functions are spreaded, which is not great, but the
patch (intended for release/17.x) does not attempt to address the issue.

Add symbols to lib/sanitizer_common/symbolizer/scripts/global_symbols.txt to
support both glibc pre-2.38 and 2.38.

When build bots migrate to glibc 2.38+, we will lose test coverage for
non-isoc23 versions since the existing C++ unittests imply _GNU_SOURCE.
Add test/sanitizer_common/TestCases/{strtol.c,scanf.c}.
They catch msan false positive in the absence of the interceptors.

Fix https://github.com/llvm/llvm-project/issues/64388
Fix https://github.com/llvm/llvm-project/issues/64946

Link: https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00010.html
("The GNU C Library version 2.38 is now available")

Diff Detail

Event Timeline

MaskRay created this revision.Aug 26 2023, 1:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2023, 1:38 PM
Herald added a subscriber: Enna1. · View Herald Transcript
MaskRay requested review of this revision.Aug 26 2023, 1:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2023, 1:38 PM
MaskRay updated this revision to Diff 553767.Aug 26 2023, 3:15 PM

Update test/asan/TestCases/atoll_strict.c and lib/asan/asan_win_dll_thunk.cpp

mgorny accepted this revision.Aug 26 2023, 8:05 PM

Thanks. I'm getting 100% pass with this patch.

This revision is now accepted and ready to land.Aug 26 2023, 8:05 PM
MaskRay updated this revision to Diff 553807.Aug 27 2023, 12:04 PM
MaskRay edited the summary of this revision. (Show Details)

rebase after precommiting asan/windows atoll/strtoll change

vitalybuka accepted this revision.Aug 27 2023, 4:17 PM
vitalybuka added inline comments.
compiler-rt/lib/asan/asan_interceptors.cpp
615

template would be nicer than macro

MaskRay updated this revision to Diff 553850.Aug 28 2023, 12:08 AM

use template in compiler-rt/lib/asan/asan_interceptors.cpp

MaskRay updated this revision to Diff 553854.Aug 28 2023, 12:37 AM

better template. use PosixSpawnImpl as an example

MaskRay marked an inline comment as done.Aug 28 2023, 12:41 AM
This revision was landed with ongoing or failed builds.Aug 28 2023, 12:50 AM
This revision was automatically updated to reflect the committed changes.
brooks added a subscriber: brooks.Aug 28 2023, 5:09 PM
brooks added inline comments.
compiler-rt/lib/msan/msan_interceptors.cpp
1770

Hi @MaskRay , looks like one of the new tests is failing on s390x since it was added, making the CI red:
https://lab.llvm.org/buildbot/#/builders/94/builds/16253

MemorySanitizer:DEADLYSIGNAL
==2727591==ERROR: MemorySanitizer: SEGV on unknown address 0x02aa1bee2000 (pc 0x03ff8064a9e0 bp 0x02aa1bee2516 sp 0x03ffd7e7e438 T2727591)
==2727591==The signal is caused by a UNKNOWN memory access.
    #0 0x3ff8064a9e0 in __GI_____strtod_l_internal /build/glibc-Y6QOHd/glibc-2.31/stdlib/strtod_l.c:686:7
    #1 0x2aa1be69c45 in strtold /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/msan/msan_interceptors.cpp:452:1
    #2 0x2aa1bed0ff3 in main /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/sanitizer_common/TestCases/strtol.c:52:3
    #3 0x3ff80624409 in __libc_start_main /build/glibc-Y6QOHd/glibc-2.31/csu/libc-start.c:308:16
    #4 0x2aa1be23f1f in _start (/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/msan-s390x-Linux/Output/strtol.c.tmp+0x23f1f)
MemorySanitizer can not provide additional info.
SUMMARY: MemorySanitizer: SEGV /build/glibc-Y6QOHd/glibc-2.31/stdlib/strtod_l.c:686:7 in __GI_____strtod_l_internal
==2727591==ABORTING

Any ideas what could cause this?

What is suspicious is that only the long double version crashes, and on s390x long double is passed via implicit reference on the stack, not in registers - could this be a problem?

MaskRay added a comment.EditedAug 31 2023, 11:39 AM

Hi @MaskRay , looks like one of the new tests is failing on s390x since it was added, making the CI red:
https://lab.llvm.org/buildbot/#/builders/94/builds/16253

MemorySanitizer:DEADLYSIGNAL
==2727591==ERROR: MemorySanitizer: SEGV on unknown address 0x02aa1bee2000 (pc 0x03ff8064a9e0 bp 0x02aa1bee2516 sp 0x03ffd7e7e438 T2727591)
==2727591==The signal is caused by a UNKNOWN memory access.
    #0 0x3ff8064a9e0 in __GI_____strtod_l_internal /build/glibc-Y6QOHd/glibc-2.31/stdlib/strtod_l.c:686:7
    #1 0x2aa1be69c45 in strtold /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/msan/msan_interceptors.cpp:452:1
    #2 0x2aa1bed0ff3 in main /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/sanitizer_common/TestCases/strtol.c:52:3
    #3 0x3ff80624409 in __libc_start_main /build/glibc-Y6QOHd/glibc-2.31/csu/libc-start.c:308:16
    #4 0x2aa1be23f1f in _start (/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/msan-s390x-Linux/Output/strtol.c.tmp+0x23f1f)
MemorySanitizer can not provide additional info.
SUMMARY: MemorySanitizer: SEGV /build/glibc-Y6QOHd/glibc-2.31/stdlib/strtod_l.c:686:7 in __GI_____strtod_l_internal
==2727591==ABORTING

Any ideas what could cause this?

What is suspicious is that only the long double version crashes, and on s390x long double is passed via implicit reference on the stack, not in registers - could this be a problem?

@uweigand
I don't know s390x:)

scanf in the source code uses __isoc99_strtold in non-C23-mode or glibc<2.38.
The interceptor __interceptor_strtold unpoisons the endptr argument.
Do you know why you only see a failure in msan's compiler-rt/test/sanitizer_common/TestCases/strtol.c version while not in compiler-rt/lib/msan/tests/msan_test.cpp:1828 TEST_STRTO_FLOAT(strtold, char, )?

You may consider excluding the strtold test with #ifndef __s390__

I've now managed to find the root causes of this problem, see https://reviews.llvm.org/D159378.

Do you know why you only see a failure in msan's compiler-rt/test/sanitizer_common/TestCases/strtol.c version while not in compiler-rt/lib/msan/tests/msan_test.cpp:1828 TEST_STRTO_FLOAT(strtold, char, )?

Hmm, I don't see the compiler-rt unittests running at all (only the standalone tests), neither on the builder nor manually via ninja check-all. Is there anything special I need to do to enable those?