This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt] Intercept the bcmp() function.
ClosedPublic

Authored by courbet on Feb 19 2019, 1:09 AM.

Details

Summary

I have not introduced a separate hook for bcmp() as I don't think there
should be any reason for a sanitizer to treat it differently from memcmp().

This is only enabled when building on POSIX with GNU extensions.

Context: this is to avoid losing coverage when emitting bcmp() == 0 instead
of memcmp() == 0 in llvm, see https://reviews.llvm.org/D56593.

Event Timeline

courbet created this revision.Feb 19 2019, 1:09 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 19 2019, 1:09 AM
Herald added subscribers: Restricted Project, jdoerfert, delcypher and 2 others. · View Herald Transcript

bcmp is BSD and it first appeard in 4.2BSD... it's not specific to GNU source but compat layer for BSD software.

Please enable it for NetBSD.

Please enable it for NetBSD.

Done, thanks.

courbet updated this revision to Diff 187325.Feb 19 2019, 1:21 AM
  • Enable bcmp interception on NetBSD.

Please enable the tests for NetBSD as well.

<strings.h> should be included unconditionally.

courbet updated this revision to Diff 187327.Feb 19 2019, 1:37 AM
  • Unconditionally test bcmp() in weak_hook_test
krytarowski added inline comments.Feb 19 2019, 1:37 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
873

Can we reuse code with memcmp(3)? bcmp(3) is a subset of memcmp(3).

Please enable the tests for NetBSD as well.

<strings.h> should be included unconditionally.

Just to make sure I understand (I'm not familiar with this code base):

  • Will the tests in Posix/ be compiled only for platforms supporting this ? Is that why <strings.h> can be included unconditionally ?
  • lib/asan/tests/asan_mem_test.cc will be compiled for platforms without <strings.h>, right ?

<strings.h> should be included unconditionally.

Sorry, I've overlooked that this is stringS.h. Including string.hunconditionally is fine. The stringS form maybe is needed for GNU, it's on BSDs too as it's a legacy header, but NetBSD both work and string.h is preferred.

lib/asan/tests/asan_mem_test.cc
247

|| defined(__NetBSD__)

courbet marked an inline comment as done.Feb 19 2019, 1:44 AM
courbet added inline comments.
lib/sanitizer_common/sanitizer_common_interceptors.inc
873

Looking at COMMON_INTERCEPTOR_ENTER implementation for, e.g. tsan:

#define SCOPED_TSAN_INTERCEPTOR(func, ...) \
    SCOPED_INTERCEPTOR_RAW(func, __VA_ARGS__); \
    if (REAL(func) == 0) { \
      Printf("FATAL: ThreadSanitizer: failed to intercept %s\n", #func); \
      Die(); \
    } \
    if (thr->in_rtl > 1) \
      return REAL(func)(__VA_ARGS__); 


#define COMMON_INTERCEPTOR_ENTER(ctx, func, ...)      \
  SCOPED_TSAN_INTERCEPTOR(func, __VA_ARGS__);         \
  TsanInterceptorContext _ctx = {thr, caller_pc, pc}; \
  ctx = (void *)&_ctx;                                \
  (void) ctx;

Given that the implementation calls REAL(func), it feels preferable to call the real bcmp instead of real memcmp, even though calling memcmp does work. But again, I'm not familiar with how you guys typically approach this, so you tell me :)

courbet updated this revision to Diff 187328.Feb 19 2019, 1:51 AM
courbet marked an inline comment as done.
  • guard with NetBSD
MaskRay added inline comments.
lib/asan/tests/asan_mem_test.cc
16

Maybe.. You can also enable it for OpenBSD string.h and FreeBSD strings.h

krytarowski added inline comments.Feb 19 2019, 2:44 AM
lib/asan/tests/asan_mem_test.cc
247

This syntax is not valid.

#if defined(_GNU_SOURCE) || defined(__NetBSD__)

courbet updated this revision to Diff 187346.Feb 19 2019, 4:20 AM
  • ifdef A || defined(B) -> if defined(A) || defined(B)
courbet updated this revision to Diff 187348.Feb 19 2019, 5:02 AM
courbet marked an inline comment as done.
  • Add FreeBSD and OpenBSD.
jyknight added inline comments.Feb 19 2019, 7:08 AM
lib/asan/tests/asan_mem_test.cc
16

Both openbsd and netbsd appear to expose it from <strings.h> (plural) as the primary location too.

On both systems, <string.h> (singular) does a conditional #include of <strings.h>, so this probably works too, but seems better to consolidate around strings.h.

joerg added a subscriber: joerg.Feb 19 2019, 9:09 AM

Actually, since it used to be part of XSI, I would include it unconditionally and only conditionalize it if a system starts to actually drop it.

krytarowski added inline comments.Feb 19 2019, 6:46 PM
lib/asan/tests/asan_mem_test.cc
16

I would include <string.h> unconditionally and only <strings.h> there where needed.

#include <string.h>
#if defined (_GNU_SOURCE)
#include <strings.h>
#endif

FreeBSD doesn't need <strings.h> either.

test/sanitizer_common/TestCases/Posix/weak_hook_test.cc
11

No need for __FreeBSD__.

courbet marked 3 inline comments as done.Feb 20 2019, 2:18 AM
courbet added inline comments.
lib/asan/tests/asan_mem_test.cc
16

I would include <string.h> unconditionally and only <strings.h> there where needed.

Right, and that's consistent with weak_hook_test.

On both systems, <string.h> (singular) does a conditional #include of <strings.h>, so this probably works too, but seems better to consolidate around strings.h.

The freebsd man page has:

A bcmp() function first appeared in 4.2BSD.  Its prototype	existed	previ-
ously in <string.h> before	it was moved to	<strings.h> for	IEEE Std
1003.1-2001 (``POSIX.1'') compliance.

In any case, on all systems strings.h seems to be doing the right thing to accommodate this so I kept strings.h only for _GNU_SOURCE for simplicity.

courbet updated this revision to Diff 187533.Feb 20 2019, 2:19 AM
courbet marked an inline comment as done.
  • Address review comments.
vitalybuka added inline comments.Feb 20 2019, 11:45 AM
lib/asan/tests/asan_mem_test.cc
212–213
lib/sanitizer_common/sanitizer_common_interceptors.inc
868

usually we don't have empty line here

873

Sometimes we do macro
But I'd prefer inline function, e.g.

int memcpy_interceptor_impl(ctx, a1, a2, size, real_fn_ptr) {
....
}

lib/sanitizer_common/sanitizer_platform_interceptors.h
145
#define SANITIZER_INTERCEPT_BCMP (SANITIZER_INTERCEPT_MEMCMP && ( \
    (SI_POSIX && _GNU_SOURCE) || \
    SI_NETBSD || SI_OPENBSD || SI_FREEBSD))

also please reformat it with "git clang-format -f --style=file --extensions=inc,cc,h,c,cpp HEAD^"

courbet updated this revision to Diff 187728.Feb 21 2019, 12:19 AM
courbet marked 4 inline comments as done.
  • Address review comments
  • clang-format

Thanks

lib/asan/tests/asan_mem_test.cc
212–213

The template is more consistent with the two other tests above. If we want to be even more consistent I can define functors:

class MemCmpWrapper {
 public:
  static int transfer(const void *l, const void *r, size_t size) {
    return Ident(memcmp)(l, r, size);
  }
};

but that seems overkill to me.

lib/sanitizer_common/sanitizer_common_interceptors.inc
873

I'm not sure I understand your suggestion correctly, but if you're suggesting we refactor using a function, then the issue is that REAL(x) only works with x a macro parameter, not if x is a function pointer:

# define PTR_TO_REAL(x) real_##x
...
# define REAL(x) __interception::PTR_TO_REAL(x)

Refactoring using a macro would be possible, but it would look quite ugly.

vitalybuka added inline comments.Feb 21 2019, 12:47 PM
lib/sanitizer_common/sanitizer_common_interceptors.inc
873

Looks like you can easily extract into a function code before REAL call

courbet updated this revision to Diff 188108.Feb 25 2019, 1:56 AM
courbet marked an inline comment as done.
  • Refactor common code for memcmp and bcmp.
courbet added inline comments.Feb 25 2019, 1:58 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
873

Right, done.
Note that this has the side-effect of adding one more function in the call stack (see test/asan/TestCases/memcmp_test.cc)

This revision is now accepted and ready to land.Feb 25 2019, 1:54 PM

Thanks for the review.

This revision was automatically updated to reflect the committed changes.
vitalybuka reopened this revision.Feb 26 2019, 10:40 AM
This revision is now accepted and ready to land.Feb 26 2019, 10:40 AM
courbet closed this revision.May 15 2019, 1:40 AM