This is an archive of the discontinued LLVM Phabricator instance.

Switch syscall(2)/__syscall(2) calls to libc calls on NetBSD
ClosedPublic

Authored by krytarowski on Aug 29 2018, 6:10 AM.

Details

Summary

When possible, switch syscall(2)/__syscall(2) calls
to direct calls of internal libc symbols.

Add a new function to detect address of a libc
symbol of a function that could be intercepted.
With the address detector in GetRealLibcAddress(),
an optional interceptor of libc call will be bypassed.

The original approach with syscall(2)/__syscall(2)
wasn't portable across supported ABIs and CPU
architectures. The indirect syscall interface is
also a candidate for removal in future revisions
of NetBSD, as the C language ABI is not a good
domain for serialization of arbitrary functions
arguments.

Switch the following functions to libc calls:

  • internal_mmap()
  • internal_munmap()
  • internal_mprotect()
  • internal_close()
  • internal_open()
  • internal_read()
  • internal_write()
  • internal_ftruncate()
  • internal_stat()
  • internal_lstat()
  • internal_fstat()
  • internal_dup2()
  • internal_readlink()
  • internal_unlink()
  • internal_rename()
  • internal_sched_yield()
  • internal__exit()
  • internal_sleep()
  • internal_execve()
  • NanoTime()
  • internal_clock_gettime()
  • internal_waitpid()
  • internal_getpid()
  • internal_getppid()
  • internal_getdents()
  • internal_lseek()
  • internal_sigaltstack()
  • internal_fork()
  • internal_sigprocmask()
  • internal_sysctl()
  • internal_sigemptyset()
  • internal_sigfillset()
  • GetTid()
  • TgKill()

This revision leaves room for refactoring in subsequent commits.

Diff Detail

Repository
rL LLVM

Event Timeline

krytarowski created this revision.Aug 29 2018, 6:10 AM
krytarowski edited the summary of this revision. (Show Details)Aug 29 2018, 6:13 AM

Just curious, any impact in term of performance ?

Just curious, any impact in term of performance ?

There might be some impact, but it's negligible (and not necessarily slower). We exchange overhead of deserialization of a syscall in syscall(2)/__syscall(2) (on the kernel side) for overhead of calling a libc gateway to the real syscall through a function pointer.

This change is also needed for future planned changes such as addition of internal_sysctl*() and adding interceptors for sysctl*().
Without switch to libc it's harder to get it done sanely, as we have got additional special cases such as sysctlbyname(3) reimplemented for the use of rumpkernels.

There are no regressions observed (on NetBSD) with this commit.

joerg added a comment.Aug 29 2018, 3:24 PM

I don't understand why most of this symbols don't reference the plain system call directly, i.e. _sys_read etc.

_sys_ seems to be available only for a small set of of system calls and is not universal.

$ nm /usr/lib/libc.so|grep _sys_|wc -l
64

I can switch these ones (read, write, close, readlink, sched_yield and wait4 for waitpid), but why?

Personally, I prefer to go for public symbols as with next major libc bump, they might stay around. So not depend too much on the internals that shall be rather not used by 3rd party software.

OK, it looks like these ones are cancel points. I will switch them to _sys_.

joerg added a comment.Aug 31 2018, 5:59 AM

Every system call has a public and internal variant. The former might be replaced by libpthread etc for thread cancellation support, but that's a different topic.

What's the internal variant for e.g. dup2(2) or getpid(2)?

loverszhaokai removed a subscriber: loverszhaokai.
loverszhaokai added a subscriber: loverszhaokai.
krytarowski edited the summary of this revision. (Show Details)

When possible, switch indirect calls to direct calls of internal libc symbols.

Now SANITIZER_NETBSD shares almost nothing with the rest
May I ask you to create sanitizer_netbsd.cc and move code there?

lib/sanitizer_common/sanitizer_linux.cc
204

This is very similar to GetRealFunctionAddress from interception/interception_linux.cc

Maybe just use DEFINE_REAL with INTERCEPT_FUNCTION?

Now SANITIZER_NETBSD shares almost nothing with the rest
May I ask you to create sanitizer_netbsd.cc and move code there?

I will do it!

lib/sanitizer_common/sanitizer_linux.cc
204

I saw this, but I wasn't sure how to do it. The existing macros are designed to install interceptors, and in my case I'm performing the opposite operation.

vitalybuka requested changes to this revision.Sep 26 2018, 5:42 PM

I will do it!

I assume work in progress?

This revision now requires changes to proceed.Sep 26 2018, 5:42 PM

I have been preempted by a conference EuroBSDCon 2018 in Bucharest, Romania.

I had two talks there (1. Userland sanitizers in NetBSD 2. Kernel sanitizers in NetBSD)

http://netbsd.org/~kamil/eurobsdcon2018_sanitizers.html
http://netbsd.org/~kamil/eurobsdcon2018_ksanitizers.html

I'm going to adapt the patch soon and submit to review.

no pressure, just removing from my "Ready to Review" list

Extract NetBSD specific code from common linux.cc to newly created netbsd.cc file.

Improve code comments.

krytarowski edited the summary of this revision. (Show Details)Sep 28 2018, 5:16 AM

I will introduce code fixups and cleanups after getting this merged.

devnexen added inline comments.Sep 28 2018, 5:48 AM
lib/sanitizer_common/sanitizer_netbsd.cc
107

Would it need check too ?

krytarowski edited the summary of this revision. (Show Details)
  • add CHECK()
  • add forgotten TgKill()
  • reduce code duplication with addition of helper macros
vitalybuka accepted this revision.Oct 1 2018, 11:02 AM
vitalybuka added inline comments.
lib/sanitizer_common/sanitizer_linux.cc
240

For this patch I see "Context not available."
Probably it was not created with arc tool.

708–710

for future refactoring it would be nice to collect all non-NETBSD stuff into single #if !SANITIZER_NETBSD

This revision is now accepted and ready to land.Oct 1 2018, 11:02 AM
This revision was automatically updated to reflect the committed changes.