This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt] change internal internal libc invariants
ClosedPublic

Authored by michaelrj on Aug 18 2021, 12:00 PM.

Details

Summary

llvm-libc is expected to be built with sanitizers and not use interceptors in
the long run. For now though, we have a hybrid process, where functions
implemented in llvm-libc are instrumented, and glibc fills and sanitizer
interceptors fill in the rest.

Current sanitizers have an invariant that the REAL(...) function called from
inside of an interceptor is uninstrumented. A lot of interceptors call strlen()
in order to figure out the size of the region to check/poison. Switch these
callsites over to the internal, unsanitized implementation.

Diff Detail

Event Timeline

michaelrj created this revision.Aug 18 2021, 12:00 PM
michaelrj requested review of this revision.Aug 18 2021, 12:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2021, 12:00 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
hctim added a comment.Aug 19 2021, 1:52 PM

Thank you for the patch.

Can you please update the commit message to be a little more clear and mention that we're changing an invariant here? Something like:

llvm-libc is expected to be built with sanitizers and not use interceptors in
the long run. For now though, we have a hybrid process, where functions
implemented in llvm-libc are instrumented, and glibc fills and sanitizer
interceptors fill in the rest.

Current sanitizers have an invariant that the REAL(...) function called from
inside of an interceptor is uninstrumented. A lot of interceptors call strlen()
in order to figure out the size of the region to check/poison. Switch these
callsites over to the internal, unsanitized implementation.

We'll need to give the same treatment to wcslen, wcsnlen, and strnlen. The interceptors of those functions should call REAL(...), but all other interceptors need to be updated. I'm sure we'll hit other points that'll need patching, but I'm happy to play whack-a-mole now that we're changing the invariant. You'll also need to add the internal_w* variants in sanitizer_libc.cpp and sanitizer_libc.h, but those should be pretty easy, something like:

uptr internal_ wcslen(wchar_t *s) {
  uptr i = 0;
  while (s[i]) i++;
  return i;
}

I've marked some of the sites that need to be updated, but it's not an exhaustive list.

compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc
380

this comment shouldn't be updated

387

don't update

7000–7001

internal_wcslen

7001–7002

internal_wcslen

7012–7013

internal_wcsnlen

7013–7014

internal_wcslen

7047

mitch review marker

michaelrj retitled this revision from [compiler-rt] change REAL(strlen) to internal_strlen to [compiler-rt] change internal internal libc invariants.Aug 19 2021, 2:11 PM
michaelrj edited the summary of this revision. (Show Details)

I'm sure we'll hit other points that'll need patching, but I'm happy to play whack-a-mole now that we're changing the invariant.

For the long-term, it would be nice if we can set up compiler-rt to pick up libc pieces from LLVM libc directly. As in, at compiler-rt build-time, pull the libc pieces necessary from LLVM libc and pack them with the compiler-rt binaries. We will need to employ some namespace discipline of course to get it working, but at the least it has the benefit that we avoid duplication with internal_* implementations.

michaelrj updated this revision to Diff 367619.Aug 19 2021, 2:31 PM
michaelrj marked 6 inline comments as done.

update commit message, add internal_wcslen and internal_wcsnlen, and use them where suggested by the comments.

michaelrj updated this revision to Diff 367621.Aug 19 2021, 2:37 PM

fix typo that was preventing building

hctim added a comment.Aug 19 2021, 2:39 PM

Can you do one more parse over REAL(wcsnlen), REAL(wcslen), and REAL(strnlen) to replace instances of those with internal_* variants when used in an interceptor annotation? Looks like a few are missing.

I'm sure we'll hit other points that'll need patching, but I'm happy to play whack-a-mole now that we're changing the invariant.

For the long-term, it would be nice if we can set up compiler-rt to pick up libc pieces from LLVM libc directly. As in, at compiler-rt build-time, pull the libc pieces necessary from LLVM libc and pack them with the compiler-rt binaries. We will need to employ some namespace discipline of course to get it working, but at the least it has the benefit that we avoid duplication with internal_* implementations.

Yes, a private copy of some libc functions would be nice to have. We already had to optimize internal_memset a little because it showed up in the profiles, it would be great not to duplicate the effort.

michaelrj updated this revision to Diff 367635.Aug 19 2021, 3:35 PM

Replace additional uses of REAL strnlen, wcslen, and wcsnlen

vitalybuka accepted this revision.Aug 20 2021, 11:45 PM

internal_strlen is slower on synthetic benchmark with strlen only, but I tried CTMark and other more real sets and can't measure a difference.
LGTM with extended test

compiler-rt/lib/sanitizer_common/sanitizer_libc.cpp
261

Can you please add tests similar to InternalStrFunctions compiler-rt/lib/sanitizer_common/tests/sanitizer_libc_test.cpp

This revision is now accepted and ready to land.Aug 20 2021, 11:45 PM

add tests for wcslen and wcsnlen

michaelrj marked an inline comment as done.Aug 23 2021, 11:12 AM
hctim accepted this revision.Aug 23 2021, 1:08 PM

LGTM with final nits.

compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc
401

Okay to call REAL(strnlen) here as, if we're using interceptors, the backing function shouldn't be instrumented.

Assumedly the REAL() variant is SIMD and faster.

401

Okay to call REAL(strnlen) here as, if we're using interceptors, the backing function shouldn't be instrumented.

Assumedly the REAL() variant is SIMD and faster.

6981

ditto, REAL(wcslen)

6989

ditto, REAL(wcsnlen)

michaelrj updated this revision to Diff 368209.Aug 23 2021, 2:08 PM
michaelrj marked 5 inline comments as done.

address final nits, will submit once the presubmits are done

This revision was landed with ongoing or failed builds.Aug 24 2021, 9:41 AM
This revision was automatically updated to reflect the committed changes.