This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt] Allow 3 simultaneous interceptors on Linux
ClosedPublic

Authored by melver on May 22 2023, 5:48 AM.

Details

Summary

Rework Linux (and *BSD) interceptors to allow for up to 3 (2 for *BSD)
simultaneous interceptors. See code comments for details.

The main motivation is to support new sampling sanitizers (in the spirit
of GWP-ASan), that have to intercept few functions. Unfortunately, the
reality is that there are user interceptors that exist in the wild.

To support foreign user interceptors, foreign dynamic analysis
interceptors, and compiler-rt interceptors all at the same time,
including any combination of them, this change enables up to 3
interceptors on Linux (2 on *BSD).

v2:

  • Revert to to the simpler "weak wrapper -(alias)-> __interceptor" scheme on architectures that cannot implement a trampoline efficiently due to complexities of resolving a preemptible symbol (PowerPC64 ELFv2 global entry, and i386 PIC).
  • Avoid duplicate intercepted functions in gen_dynamic_list.py, due to matching interceptor_X and _interceptor_X.
  • Fix s390 __tls_get_offset.

Diff Detail

Event Timeline

melver created this revision.May 22 2023, 5:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2023, 5:48 AM
melver requested review of this revision.May 22 2023, 5:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2023, 5:48 AM
Herald added subscribers: Restricted Project, pcwang-thead. · View Herald Transcript
dvyukov added inline comments.May 22 2023, 6:11 AM
compiler-rt/lib/interception/interception.h
173

Can't we still use an alias to detect if our interceptor is installed or not?
Namely: give the actual function with ASM_JUMP body some new name (actual_func); make func a weak alias to actual_func; use comparison with __actual_func to detect override.

melver marked an inline comment as done.May 22 2023, 8:28 AM
melver added inline comments.
compiler-rt/lib/interception/interception.h
173

Ok, I've changed it. PTAL.

melver updated this revision to Diff 524324.May 22 2023, 8:29 AM
melver marked an inline comment as done.
  • Remove use of section.
  • Implement for .asm files.
melver updated this revision to Diff 524474.May 22 2023, 1:52 PM
melver edited the summary of this revision. (Show Details)
  • Commit message
  • Fix all tests

Overall looks good.
I think not having a section will increase chances of sticking since this is very close to what we are doing now.

compiler-rt/lib/interception/tests/interception_linux_foreign_test.cpp
36

I think this should call double underscore.

41

It would be useful to test "triple" interception as well and also test the order of calls.
One way to test order of all calls is to intercept printf/write and print something to stdout in each interceptor, then an output test can ensure order of all calls.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
269 ↗(On Diff #524474)

I think this should be volatile, the block does not return anything, so compiler is free to remove it.

272 ↗(On Diff #524474)

If we define JUMP_INSTR just b/jmp/tail, then we can (1) dedup this macro, (2) reuse it in ASM_INTERCEPTOR_TRAMPOLINE, so that callers don't need to pass it.

compiler-rt/lib/sanitizer_common/sanitizer_stacktrace_printer.cpp
37

Do we now pass custom prefix anywhere? Or is it only for tests?

Could this be split into smaller parts? I bet we are going to land/revert some parts. Better to do with smaller ones.

compiler-rt/lib/interception/interception.h
161

it would be better if you introduce new macros as define to existing one and replace users as NFC patch

compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors_vfork_aarch64.inc.S
43 ↗(On Diff #524474)

you can introduce these macros as NFC to existing implementation

melver updated this revision to Diff 525117.May 24 2023, 5:14 AM
melver marked 7 inline comments as done.
melver retitled this revision from [Interceptors] Rework Linux interceptors allowing for up to 3 interceptors to [compiler-rt] Allow 3 simultaneous interceptors on Linux.
melver edited the summary of this revision. (Show Details)May 24 2023, 5:14 AM
  • Address comments and various cleanups.
melver added inline comments.May 24 2023, 5:18 AM
compiler-rt/lib/interception/tests/interception_linux_foreign_test.cpp
41

Let's keep using unit tests. I verify order by just doing some non-commutative math.

compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
272 ↗(On Diff #524474)

It's called "ASM_TAIL_CALL" now, because apparently riscv is special.

compiler-rt/lib/sanitizer_common/sanitizer_stacktrace_printer.cpp
37

Not sure where else it uses custom prefix - but I didn't want to change API of this function here.

dvyukov added inline comments.May 24 2023, 5:33 AM
compiler-rt/lib/interception/tests/interception_linux_foreign_test.cpp
47

I think it will more readable/maintainable if we always multiply old value by 10 and add a value unique to interceptor (1/2). Then each digit will be journal of calls.

84

Reset these to 0 between calls, then they won't accumulate.

compiler-rt/test/msan/Linux/b64.cpp
63 ↗(On Diff #525117)

If we now always strip __interceptor_ prefix, why do they appear in reports?

melver marked 3 inline comments as done.May 24 2023, 9:17 AM
melver added inline comments.
compiler-rt/test/msan/Linux/b64.cpp
63 ↗(On Diff #525117)

msan seems to have its own function name printing.
Let's do this: https://reviews.llvm.org/D151343

melver updated this revision to Diff 525215.May 24 2023, 9:19 AM
melver marked an inline comment as done.
  • Improve test.
  • Move msan changes to D151343
vitalybuka accepted this revision.May 24 2023, 6:28 PM
vitalybuka added 1 blocking reviewer(s): dvyukov.
MaskRay added inline comments.May 24 2023, 8:45 PM
compiler-rt/lib/interception/interception.h
81

the interceptor implementation is in ___interceptor_func, which is aliased by a weak function __interceptor_func

___interceptor_func occurs twice?

MaskRay added a comment.EditedMay 24 2023, 8:49 PM

x86-64's -m32 mode libclang_rt.ubsan_standalone.so doesn't link

ld.lld: error: relocation R_386_PC32 cannot be used against symbol '__interceptor_signal'; recompile with -fPIC
>>> defined in projects/compiler-rt/lib/ubsan/CMakeFiles/RTUbsan_standalone.i386.dir/ubsan_signals_standalone.cpp.o
>>> referenced by sanitizer_signal_interceptors.inc:55 (/home/maskray/llvm/compiler-rt/lib/ubsan/../sanitizer_common/sanitizer_signal_intercep
tors.inc:55)
>>>               projects/compiler-rt/lib/ubsan/CMakeFiles/RTUbsan_standalone.i386.dir/ubsan_signals_standalone.cpp.o:(__interceptor_trampoline_signal)

__interceptor_signal is a preemptible symbol, which cannot be referenced by a direct access relocation. jmp __interceptor_signal creates an R_386_PC32 relocation (direct access relocation). jmp __interceptor_signal@plt is needed to generate R_386_PLT32.
(See https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected "Non-default visibility ifunc and R_386_PC32" for detail.)

# define ASM_INTERCEPTOR_TRAMPOLINE(name) needs a special case for x86-32. Alternatively, make # define ASM_TAIL_CALL jmp a function-like macro to add the @plt modifier (suffix).

MaskRay added inline comments.May 24 2023, 9:06 PM
compiler-rt/lib/interception/interception.h
191

Unfortunately naked is not a generic attribute. Many targets don't support naked. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77882 for GCC AArch64.

Rework Linux (and *BSD) interceptors to allow for up to 3 (2) simultaneous interceptors. See code comments for details.

What "3 (2) simultaneous interceptors" mean seem confusing (what are counted as interceptors?)... It'd be good to give an example in the summary.

dvyukov added inline comments.May 25 2023, 12:28 AM
compiler-rt/lib/interception/interception.h
81

It's 2 vs 3 underscores :)

191

Is it possible to emit whole function inside of asm block? It can emit symbols, so should be possible, right?
We could create one fake function with inline asm, which will create another complete function.

melver marked 4 inline comments as done.May 26 2023, 8:44 AM

x86-64's -m32 mode libclang_rt.ubsan_standalone.so doesn't link

ld.lld: error: relocation R_386_PC32 cannot be used against symbol '__interceptor_signal'; recompile with -fPIC
>>> defined in projects/compiler-rt/lib/ubsan/CMakeFiles/RTUbsan_standalone.i386.dir/ubsan_signals_standalone.cpp.o
>>> referenced by sanitizer_signal_interceptors.inc:55 (/home/maskray/llvm/compiler-rt/lib/ubsan/../sanitizer_common/sanitizer_signal_intercep
tors.inc:55)
>>>               projects/compiler-rt/lib/ubsan/CMakeFiles/RTUbsan_standalone.i386.dir/ubsan_signals_standalone.cpp.o:(__interceptor_trampoline_signal)

__interceptor_signal is a preemptible symbol, which cannot be referenced by a direct access relocation. jmp __interceptor_signal creates an R_386_PC32 relocation (direct access relocation). jmp __interceptor_signal@plt is needed to generate R_386_PLT32.
(See https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected "Non-default visibility ifunc and R_386_PC32" for detail.)

# define ASM_INTERCEPTOR_TRAMPOLINE(name) needs a special case for x86-32. Alternatively, make # define ASM_TAIL_CALL jmp a function-like macro to add the @plt modifier (suffix).

Ok, I've changed it. PTAL.

compiler-rt/lib/interception/interception.h
191

Doing all in asm now.

melver updated this revision to Diff 526081.May 26 2023, 8:44 AM
melver marked an inline comment as done.
melver edited the summary of this revision. (Show Details)
  • Address comments.

Rework Linux (and *BSD) interceptors to allow for up to 3 (2) simultaneous interceptors. See code comments for details.

What "3 (2) simultaneous interceptors" mean seem confusing (what are counted as interceptors?)... It'd be good to give an example in the summary.

The wording there was confusing (2 for *BSD).

The commit message says "support foreign user interceptors, foreign dynamic analysis interceptors, and compiler-rt interceptors all at the same time, including any combination of them," - so in this case, there can be up to 3 interceptors that want to know when some function was called. We also have to support the case where one of them does not exist, and provide the protocol to follow to make it all work without special builds for each combination.

melver updated this revision to Diff 526160.May 26 2023, 12:38 PM
  • Document implementation better - now that it's all in asm it's hard to immediately see what it's doing.
dvyukov accepted this revision.May 30 2023, 5:39 AM
This revision is now accepted and ready to land.May 30 2023, 5:39 AM
MaskRay accepted this revision.May 30 2023, 9:14 AM

@plt will break on most non-x86 targets. This should be fixed first.

compiler-rt/lib/sanitizer_common/sanitizer_asm.h
98

Many targets don't use @plt or at least, don't recommend @PLT.

@PLT probably should be x86 only.

For example, aarch64 doesn't support @plt.

compiler-rt/lib/sanitizer_common/scripts/gen_dynamic_list.py
128

__? or _?_

MaskRay added inline comments.May 30 2023, 9:18 AM
compiler-rt/lib/sanitizer_common/sanitizer_asm.h
55

#error doesn't need double quotes. Un-capitalized messages are more common.

melver updated this revision to Diff 526706.May 30 2023, 10:27 AM
melver marked 3 inline comments as done.
  • Use @plt only with x86 ELF.
  • No " with #error
  • Improve regex in dynamic sym script.
melver added inline comments.May 30 2023, 10:29 AM
compiler-rt/lib/sanitizer_common/sanitizer_asm.h
55

Changed.

But somehow gcc docs say otherwise: https://gcc.gnu.org/onlinedocs/cpp/Diagnostics.html

The line must consist of complete tokens. It is wisest to make the argument of these directives be a single string constant; this avoids problems with apostrophes and the like.

Maybe that is no longer an issue?

98

Right - getting this right everywhere is a bit confusing to say the least (because I see mentions of "@plt" in tests on aarch64, etc.).

So now added a macro to add this conditionally only if "ELF && x86".

Relying on your expertise here. Thanks!

PTAL.

MaskRay added a comment.EditedMay 30 2023, 1:00 PM

The current version will break most check-asan tests on ppc64le due to complexity of global entry/local entry in to TOC ABI.

(gdb) bt
#0  0x00000001001be578 in __sanitizer::SuppressionContext::Match(char const*, char const*, __sanitizer::Suppression**) ()
#1  0x00000001001a2710 in __asan::IsInterceptorSuppressed(char const*) ()
#2  0x000000010018f774 in __interceptor_memcpy ()
#3  0x00007ffff7b7b1f4 in __libc_sigaction (sig=<optimized out>, act=0x7ffffffff300, oact=0x0) at ../sysdeps/unix/sysv/linux/sigaction.c:51
#4  0x00007ffff7b674d0 in __pthread_initialize_minimal_internal () at nptl-init.c:290
#5  0x00007ffff7b66488 in _init () at ../sysdeps/powerpc/powerpc64/crti.S:78
#6  0x00007ffff7fc02a0 in call_init (env=0x7ffffffff558, argv=0x7ffffffff548, argc=1, l=0x7ffff7ff29b0) at dl-init.c:58
#7  call_init (env=0x7ffffffff558, argv=0x7ffffffff548, argc=1, l=0x7ffff7ff29b0) at dl-init.c:28
#8  _dl_init (main_map=0x7ffff7ff1170, argc=<optimized out>, argv=0x7ffffffff548, env=0x7ffffffff558) at dl-init.c:86
#9  0x00007ffff7fa158c in _dl_start_user () from /lib64/ld64.so.2
(gdb)

To fix Power ELFv2, we can use

#ifdef __powerpc64__
#  define PPC64_GLOBALENTRY(trampoline)                                        \
    "addis 2, 12, .TOC.-" trampoline "@ha\n"                                   \
    "addi 2, 2, .TOC.-" trampoline "@l\n"                                      \
    ".localentry " trampoline ",.-" trampoline "\n"
#else
#  define PPC64_GLOBALENTRY(trampoline)
#endif

#  define DECLARE_WRAPPER(ret_type, func, ...)                                 \
    extern "C" ret_type func(__VA_ARGS__);                                     \
    extern "C" ret_type TRAMPOLINE(func)(__VA_ARGS__);                         \
    extern "C" ret_type __interceptor_##func(__VA_ARGS__)                      \
        INTERCEPTOR_ATTRIBUTE __attribute__((weak)) ALIAS(WRAP(func));         \
    asm(                                                                       \
      ".text\n"                                                                \
      __ASM_WEAK_WRAPPER(func)                                                 \
      ".set " #func ", " SANITIZER_STRINGIFY(TRAMPOLINE(func)) "\n"            \
      ".globl " SANITIZER_STRINGIFY(TRAMPOLINE(func)) "\n"                     \
      ".type  " SANITIZER_STRINGIFY(TRAMPOLINE(func)) ", @function\n"          \
      SANITIZER_STRINGIFY(TRAMPOLINE(func)) ":\n"                              \
      SANITIZER_STRINGIFY(CFI_STARTPROC) "\n"                                  \
      PPC64_GLOBALENTRY(SANITIZER_STRINGIFY(TRAMPOLINE(func)))                 \
      SANITIZER_STRINGIFY(ASM_TAIL_CALL) " __interceptor_"                     \
        SANITIZER_STRINGIFY(ASM_PREEMPTIBLE_SYM(func)) "\n"                    \
      SANITIZER_STRINGIFY(CFI_ENDPROC) "\n"                                    \
      ".size  " SANITIZER_STRINGIFY(TRAMPOLINE(func)) ", "                     \
           ".-" SANITIZER_STRINGIFY(TRAMPOLINE(func)) "\n"                     \
    );

I have some brief description on
https://maskray.me/blog/2023-02-26-linker-notes-on-power-isa#tail-call

RISC-V should have @plt due to the stupid broken GNU toolchain semantics (which may or may not have been fixed to match LLVM these days)

MaskRay requested changes to this revision.May 30 2023, 1:25 PM
This revision now requires changes to proceed.May 30 2023, 1:25 PM

RISC-V should have @plt due to the stupid broken GNU toolchain semantics (which may or may not have been fixed to match LLVM these days)

Is that special handling of undefined weak for R_RISCV_CALL in GNU ld?
The trampoline ensures the ASM_PREEMPTIBLE_SYM jump target is always defined, so the semantics do not apply.


This patch will break Power ELFv2, so I requested changes. I'll update the previous comment to be clearer.

RISC-V should have @plt due to the stupid broken GNU toolchain semantics (which may or may not have been fixed to match LLVM these days)

Is that special handling of undefined weak for R_RISCV_CALL in GNU ld?
The trampoline ensures the ASM_PREEMPTIBLE_SYM jump target is always defined, so the semantics do not apply.

No, there are other weird and broken behaviours when you don't use the PLT variant. They probably don't apply here (AFAIR they were to do with mixing PLT and non-PLT relocations), but it's always better to use the PLT variant.

melver updated this revision to Diff 526968.May 31 2023, 2:34 AM
  • Fix RISC-V preemptible symbol, use @plt (reported by jrtc27)
  • Fix PPC64 tail call, needs global entry (reported by MaskRay)
melver added a comment.EditedMay 31 2023, 2:38 AM

The current version will break most check-asan tests on ppc64le due to complexity of global entry/local entry in to TOC ABI.
[...]

To fix Power ELFv2, we can use

#ifdef __powerpc64__
#  define PPC64_GLOBALENTRY(trampoline)                                        \
    "addis 2, 12, .TOC.-" trampoline "@ha\n"                                   \
    "addi 2, 2, .TOC.-" trampoline "@l\n"                                      \
    ".localentry " trampoline ",.-" trampoline "\n"
#else
#  define PPC64_GLOBALENTRY(trampoline)
#endif

#  define DECLARE_WRAPPER(ret_type, func, ...)                                 \
    extern "C" ret_type func(__VA_ARGS__);                                     \
    extern "C" ret_type TRAMPOLINE(func)(__VA_ARGS__);                         \
    extern "C" ret_type __interceptor_##func(__VA_ARGS__)                      \
        INTERCEPTOR_ATTRIBUTE __attribute__((weak)) ALIAS(WRAP(func));         \
    asm(                                                                       \
      ".text\n"                                                                \
      __ASM_WEAK_WRAPPER(func)                                                 \
      ".set " #func ", " SANITIZER_STRINGIFY(TRAMPOLINE(func)) "\n"            \
      ".globl " SANITIZER_STRINGIFY(TRAMPOLINE(func)) "\n"                     \
      ".type  " SANITIZER_STRINGIFY(TRAMPOLINE(func)) ", @function\n"          \
      SANITIZER_STRINGIFY(TRAMPOLINE(func)) ":\n"                              \
      SANITIZER_STRINGIFY(CFI_STARTPROC) "\n"                                  \
      PPC64_GLOBALENTRY(SANITIZER_STRINGIFY(TRAMPOLINE(func)))                 \
      SANITIZER_STRINGIFY(ASM_TAIL_CALL) " __interceptor_"                     \
        SANITIZER_STRINGIFY(ASM_PREEMPTIBLE_SYM(func)) "\n"                    \
      SANITIZER_STRINGIFY(CFI_ENDPROC) "\n"                                    \
      ".size  " SANITIZER_STRINGIFY(TRAMPOLINE(func)) ", "                     \
           ".-" SANITIZER_STRINGIFY(TRAMPOLINE(func)) "\n"                     \
    );

I have some brief description on
https://maskray.me/blog/2023-02-26-linker-notes-on-power-isa#tail-call

Thanks - should be fixed (for the inline asm and normal asm version).

powerpc64{,le} ELFv2 breakages are mostly fixed, but -shared-libsan will fail. -shared-libsan is used by GCC, so we should fix it, not just UNSUPPORTED the test (compiler-rt/test/asan/TestCases/replaceable_new_delete_shared.cpp).

(gdb) bt
#0  0x00007ffff75818f8 in __plt___interceptor_memcpy () from /home/maskray/llvm/out/debug/lib/clang/17/lib/powerpc64le-unknown-linux-gnu/libclang_rt.asan.so
#1  0x00007ffff6ccb1f4 in __libc_sigaction (sig=<optimized out>, act=0x7ffffffff260, oact=0x0) at ../sysdeps/unix/sysv/linux/sigaction.c:51
#2  0x00007ffff6cb74d0 in __pthread_initialize_minimal_internal () at nptl-init.c:290
#3  0x00007ffff6cb6488 in _init () at ../sysdeps/powerpc/powerpc64/crti.S:78
#4  0x00007ffff7fc02a0 in call_init (env=0x7ffffffff4b8, argv=0x7ffffffff4a8, argc=1, l=0x7ffff7ff44d0) at dl-init.c:58
#5  call_init (env=0x7ffffffff4b8, argv=0x7ffffffff4a8, argc=1, l=0x7ffff7ff44d0) at dl-init.c:28
#6  _dl_init (main_map=0x7ffff7ff1170, argc=<optimized out>, argv=0x7ffffffff4a8, env=0x7ffffffff4b8) at dl-init.c:86
#7  0x00007ffff7fa158c in _dl_start_user () from /lib64/ld64.so.2

For powerpc64{,le} ELFv2, we should use b foo; nop instead of b foo.
Linkers rewrite the nop if it is preemptible.

Therefore, this part

SANITIZER_STRINGIFY(ASM_TAIL_CALL) " __interceptor_"                     \
        SANITIZER_STRINGIFY(ASM_PREEMPTIBLE_SYM(func)) "\n"

needs adjustment for powerpc64.

# objdump -wdr projects/compiler-rt/test/asan/POWERPC64LELinuxConfig/TestCases/Output/use-after-free.cpp.tmp
0000000000000018 <__interceptor_trampoline_memcpy>:
      18:       00 00 4c 3c     addis   r2,r12,0        18: R_PPC64_REL16_HA    .TOC.
      1c:       00 00 42 38     addi    r2,r2,0 1c: R_PPC64_REL16_LO    .TOC.+0x4
      20:       00 00 00 48     b       20 <__interceptor_trampoline_memcpy+0x8>        20: R_PPC64_REL24       __interceptor_memcpy


# objdump -wdr lib/clang/17/lib/powerpc64le-unknown-linux-gnu/libclang_rt.asan.so
00000000001d2cb8 <__interceptor_trampoline_memcpy>:
  1d2cb8:       04 00 41 e8     ld      r2,4(r1)
  1d2cbc:       a0 86 42 38     addi    r2,r2,-31072
  1d2cc0:       30 ec 01 48     b       1f18f0 <__plt___interceptor_memcpy>
melver updated this revision to Diff 527180.May 31 2023, 1:18 PM
  • Fix ASM_PREEMPTIBLE_SYM for ppc64 (it wants a nop after 'b sym').

powerpc64{,le} ELFv2 breakages are mostly fixed, but -shared-libsan will fail. -shared-libsan is used by GCC, so we should fix it, not just UNSUPPORTED the test (compiler-rt/test/asan/TestCases/replaceable_new_delete_shared.cpp).

[...]

For powerpc64{,le} ELFv2, we should use b foo; nop instead of b foo.
Linkers rewrite the nop if it is preemptible.

Ouch - thanks, done.

I'm starting to think if there was a nice way to get what we want in plain C++, but it'd require updating all interceptors to take the number of function args e.g. INTERCEPTOR{1,2,3,4,5}.

But that doesn't solve the problem with pure .S implementations, which may not be tested in all these cases, and without the inline asm() version we'd not catch these issues as easily.
I'm torn, but probably keeping it all in asm is, paradoxically, more reliable in this case...

MaskRay accepted this revision.Jun 1 2023, 12:48 PM
MaskRay added inline comments.
compiler-rt/lib/interception/tests/interception_linux_foreign_test.cpp
39

For * and +, the precedence is apparent and we don't usually add parens.

This revision is now accepted and ready to land.Jun 1 2023, 12:48 PM

If powerpc64{,le} ELFv2 continue to be a problem, perhaps we should give up 3 interceptors for it.

melver updated this revision to Diff 527578.Jun 1 2023, 1:01 PM
melver marked an inline comment as done.
  • Simplify arith expressions in tests.
This revision was landed with ongoing or failed builds.Jun 7 2023, 12:11 AM
This revision was automatically updated to reflect the committed changes.
melver reopened this revision.Jun 7 2023, 1:22 AM
This revision is now accepted and ready to land.Jun 7 2023, 1:22 AM
melver updated this revision to Diff 529856.Jun 9 2023, 1:52 AM
melver edited the summary of this revision. (Show Details)
  • Do not implement trampoline for i386 and powerpc64 due to complexities of resolving a preemptible symbol. We may implement them properly at a later point if the need arises (howeer, I suspect nobody needs this on i386 anyway).
  • Attempt to fix s390.
This revision was landed with ongoing or failed builds.Jun 9 2023, 2:32 AM
This revision was automatically updated to reflect the committed changes.
ro added a subscriber: ro.Jun 13 2023, 1:55 AM

This patch caused 4 regressions on Solaris/SPARC:

SanitizerCommon-ubsan-sparc-SunOS :: Posix/replace_sigaction.cpp
SanitizerCommon-ubsan-sparc-SunOS :: Posix/signal.cpp
SanitizerCommon-ubsan-sparcv9-SunOS :: Posix/replace_sigaction.cpp
SanitizerCommon-ubsan-sparcv9-SunOS :: Posix/signal.cpp

E.g. Posix/replace_sigaction.cpp FAILs with

UndefinedBehaviorSanitizer:DEADLYSIGNAL
UndefinedBehaviorSanitizer:DEADLYSIGNAL
UndefinedBehaviorSanitizer: nested bug in the same thread, aborting.

gdb shows

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x00000008 in ?? ()
(gdb) bt
#0  0x00000008 in ?? ()
#1  0xff1ae3a8 in defrag () from /usr/lib/ld.so.1
#2  0x0005aedc in install<void (int)>(void (*)(int), sigaction*) (
    handler=0x5a4f8 <sahandler(int)>, prev=0x0)
    at /vol/llvm/src/llvm-project/local/compiler-rt/test/sanitizer_common/TestCases/Posix/replace_sigaction.cpp:23
#3  0x0005a604 in test<void (int), void (int)>(void (*)(int), void (*)(int)) (
    from=0x5a4f8 <sahandler(int)>, to=0x5a4f8 <sahandler(int)>)
    at /vol/llvm/src/llvm-project/local/compiler-rt/test/sanitizer_common/TestCases/Posix/replace_sigaction.cpp:40
#4  0x0005a560 in testAll<void (int)>(void (*)(int)) (
    to=0x5a4f8 <sahandler(int)>)
    at /vol/llvm/src/llvm-project/local/compiler-rt/test/sanitizer_common/TestCases/Posix/replace_sigaction.cpp:55
#5  0x0005a4c8 in main ()
    at /vol/llvm/src/llvm-project/local/compiler-rt/test/sanitizer_common/TestCases/Posix/replace_sigaction.cpp:62

truss (the Solaris syscall tracer) gives

Incurred fault #6, FLTBOUNDS  %pc = 0x00000008
  siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008
Received signal #11, SIGSEGV [caught]
  siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008

I suspected that this section in compiler-rt/lib/sanitizer_common/sanitizer_asm.h

#if defined(__x86_64__) || defined(__i386__) || defined(__sparc__)
# define ASM_TAIL_CALL jmp

is the culprit, and indeed this doesn't set the jmp insn's delay slot at all. If you disassemble the resulting code, you get

(gdb) x/10i __interceptor_trampoline_signal
   0x594c4 <signal>:	jmp  4
   0x594c8 <sigaction>:	jmp  8
   0x594cc <__interceptor_signal>:	save  %sp, -96, %sp

which explains the SEGV. I managed to avoid this with the following snippet:

--- a/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
@@ -66,7 +66,7 @@
 # define ASM_SIZE(symbol) .size symbol, .-symbol
 # define ASM_SYMBOL(symbol) symbol
 # define ASM_SYMBOL_INTERCEPTOR(symbol) symbol
-# if defined(__i386__) || defined(__powerpc__)
+# if defined(__i386__) || defined(__powerpc__) || defined(__sparc__)
 // For details, see interception.h
 #  define ASM_WRAPPER_NAME(symbol) __interceptor_##symbol
 #  define ASM_TRAMPOLINE_ALIAS(symbol, name)                                   
\

I'd expect that other targets with delay slots have similar problems.

In D151085#4416709, @ro wrote:
is the culprit, and indeed this doesn't set the `jmp` insn's delay slot at all.  If you disassemble the resulting code, you get

(gdb) x/10i __interceptor_trampoline_signal

0x594c4 <signal>:	jmp  4
0x594c8 <sigaction>:	jmp  8
0x594cc <__interceptor_signal>:	save  %sp, -96, %sp
which explains the `SEGV`.  I managed to avoid this with the following snippet:
  • a/compiler-rt/lib/sanitizer_common/sanitizer_asm.h

+++ b/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
@@ -66,7 +66,7 @@

  1. define ASM_SIZE(symbol) .size symbol, .-symbol
  2. define ASM_SYMBOL(symbol) symbol
  3. define ASM_SYMBOL_INTERCEPTOR(symbol) symbol
  4. if defined(i386) || defined(powerpc)

+# if defined(i386) || defined(powerpc) || defined(sparc)
// For details, see interception.h

  1. define ASM_WRAPPER_NAME(symbol) __interceptor_##symbol
  2. define ASM_TRAMPOLINE_ALIAS(symbol, name)

\

I'd expect that other targets with delay slots have similar problems.

Do you think you could fix up ASM_TAIL_CALL for sparc, or just revert to the interception without trampoline?
In the latter case, I can just commit the above if you like - let me know what you prefer.

ro added a comment.Jun 13 2023, 4:26 AM

Do you think you could fix up ASM_TAIL_CALL for sparc, or just revert to the interception without trampoline?
In the latter case, I can just commit the above if you like - let me know what you prefer.

I tried the following:

--- a/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
@@ -53,6 +53,10 @@
 # define ASM_TAIL_CALL tail
 #endif
 
+#if defined(__sparc__)
+# define ASM_DELAY_INSN nop
+#endif
+
 #if defined(__ELF__) && defined(__x86_64__) || defined(__i386__) || \
     defined(__riscv)
 # define ASM_PREEMPTIBLE_SYM(sym) sym@plt
@@ -88,6 +92,7 @@
          __interceptor_trampoline_##name:                                      \
                  CFI_STARTPROC;                                                \
                  ASM_TAIL_CALL ASM_PREEMPTIBLE_SYM(__interceptor_##name);      \
+                 ASM_DELAY_INSN;                                               \
                  CFI_ENDPROC;                                                  \
          ASM_SIZE(__interceptor_trampoline_##name)
 #  define ASM_INTERCEPTOR_TRAMPOLINE_SUPPORT 1

but that didn't even get the nop into the object. Somehow, I also failed to use -save-temps to check the assembler output. Even if this can be made to work, it would need extra scrutiny by someone who really understands the SPARC ISA.

At least for now, I guess it's best to go for my workaround patch.

In D151085#4417052, @ro wrote:

Do you think you could fix up ASM_TAIL_CALL for sparc, or just revert to the interception without trampoline?
In the latter case, I can just commit the above if you like - let me know what you prefer.

I tried the following:

--- a/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
@@ -53,6 +53,10 @@
 # define ASM_TAIL_CALL tail
 #endif
 
+#if defined(__sparc__)
+# define ASM_DELAY_INSN nop
+#endif
+
 #if defined(__ELF__) && defined(__x86_64__) || defined(__i386__) || \
     defined(__riscv)
 # define ASM_PREEMPTIBLE_SYM(sym) sym@plt
@@ -88,6 +92,7 @@
          __interceptor_trampoline_##name:                                      \
                  CFI_STARTPROC;                                                \
                  ASM_TAIL_CALL ASM_PREEMPTIBLE_SYM(__interceptor_##name);      \
+                 ASM_DELAY_INSN;                                               \
                  CFI_ENDPROC;                                                  \
          ASM_SIZE(__interceptor_trampoline_##name)
 #  define ASM_INTERCEPTOR_TRAMPOLINE_SUPPORT 1

but that didn't even get the nop into the object. Somehow, I also failed to use -save-temps to check the assembler output. Even if this can be made to work, it would need extra scrutiny by someone who really understands the SPARC ISA.

There are 2 places that need to be kept in sync per comment "// Keep trampoline implementation in sync with interception/interception.h".

At least for now, I guess it's best to go for my workaround patch.

But I think it's good to just do the simple fix first, and if you feel that you need the support as described in this patch, then we can see if it's possible to get it to work.

https://reviews.llvm.org/rG2c1ec06d45a9

ro added a comment.Jun 13 2023, 5:24 AM
In D151085#4417052, @ro wrote:

Do you think you could fix up ASM_TAIL_CALL for sparc, or just revert to the interception without trampoline?
In the latter case, I can just commit the above if you like - let me know what you prefer.

I tried the following:

--- a/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_asm.h
@@ -53,6 +53,10 @@
 # define ASM_TAIL_CALL tail
 #endif
 
+#if defined(__sparc__)
+# define ASM_DELAY_INSN nop
+#endif
+
 #if defined(__ELF__) && defined(__x86_64__) || defined(__i386__) || \
     defined(__riscv)
 # define ASM_PREEMPTIBLE_SYM(sym) sym@plt
@@ -88,6 +92,7 @@
          __interceptor_trampoline_##name:                                      \
                  CFI_STARTPROC;                                                \
                  ASM_TAIL_CALL ASM_PREEMPTIBLE_SYM(__interceptor_##name);      \
+                 ASM_DELAY_INSN;                                               \
                  CFI_ENDPROC;                                                  \
          ASM_SIZE(__interceptor_trampoline_##name)
 #  define ASM_INTERCEPTOR_TRAMPOLINE_SUPPORT 1

but that didn't even get the nop into the object. Somehow, I also failed to use -save-temps to check the assembler output. Even if this can be made to work, it would need extra scrutiny by someone who really understands the SPARC ISA.

There are 2 places that need to be kept in sync per comment "// Keep trampoline implementation in sync with interception/interception.h".

Indeed: that way I got the nop into the object. However, the jmp insn still jumps to an absolute (unmapped) address, causing the SEGV. I don't yet see why.

At least for now, I guess it's best to go for my workaround patch.

But I think it's good to just do the simple fix first, and if you feel that you need the support as described in this patch, then we can see if it's possible to get it to work.

https://reviews.llvm.org/rG2c1ec06d45a9

Excellent, thanks a lot. After all, the trampoline is just an optimization, so correctness should go first.