Page MenuHomePhabricator

Allow system header to provide their own implementation of some builtin
ClosedPublic

Authored by serge-sans-paille on Dec 5 2019, 11:09 AM.

Details

Summary

If a system header provides an (inline) implementation of some of their function, clang still matches on the function name and generate the appropriate llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation « users may not provide their own version of symbols » but doesn't account for the fact that glibc itself can provide inline version of some functions.

It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that case an inline version of memcpy calls __memcpy_chk, a function that performs extra runtime checks. Clang currently ignores the inline version and thus provides no runtime check.

This code fixes the issue by detecting functions whose name is a builtin name but also have a system-provided implementation.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptDec 5 2019, 11:09 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
serge-sans-paille marked an inline comment as done.

test case updated

Thanks for looking into this!

I dunno what implications this has elsewhere, but I'd like to note a few things about this general approach:

  • AIUI, this'll only work for FORTIFY functions which are literally calling __builtin___foo_chk; an equivalent implementation without such a builtin, like below, should be blocked by other code in clang, since it results in infinite recursion. glibc implements quite a few (most?) FORTIFY'ed functions in this way.
void *memcpy(void *a, void *b, size_t c) {
  if (__builtin_object_size(a, 0) == -1)
    return memcpy(a, b, c);
  return __memcpy_chk(a, b, c, __builtin_object_size(a, 0));
}
  • This change only gives us -D_FORTIFY_SOURCE=1-level checking here. If we want to offer better support for -D_FORTIFY_SOURCE=2 (including potentially FORTIFY's compile-time diagnostics), which is AIUI what most of the world uses, we'll need to add clang-specific support into glibc's headers. At that point, this patch becomes a nop, since the FORTIFY'ed memcpy becomes an overload for the built-in one.

In other words, I think this patch will improve our support for the pieces of glibc's FORTIFY that're implemented with __builtin___*_chk, but I don't think there's much more room to incrementally improve beyond that before we need to start hacking on glibc directly. If we get to that point, the changes we need to make in glibc will likely obviate the need for this patch.

So if your overarching goal here is to make clang support glibc's FORTIFY as-is a bit better, that's great and I'm personally cool with this patch (again, not being deeply aware of what interactions this may have in a non-FORTIFY'ed context). If the intent is to build off of this to try and make glibc's FORTIFY story with clang a great one, I'm unsure how useful this patch will be in the long run.

clang/lib/AST/Decl.cpp
3050

style nit: const FunctionDecl * is the preferred ordering in most of LLVM

3221

style nit: please don't use braces here or below, for consistency with nearby code.

It looks like this patch accidentally suppresses certain warnings when _FORTIFY_SOURCE is enabled. For example, Sema::CheckMemaccessArguments only gets called for recognized builtin functions.

Fix interaction with fortify warnings (cc @efriedma)

Also I've checked interactions with the test suite I wrote here https://github.com/serge-sans-paille/fortify-test-suite/ and clang now passes all the dynamic check, and only fails a few static checks, which is a great improvement. I still need to double check all the examples pointed out by @george.burgess.iv .

  • Be less intrusive in the getBuiltinID() system, just hook inside CGExpr.cpp
  • Take into account @george.burgess.iv remark on builtins referencing themselves in an override. Add a test case for that scenario.
clang/lib/AST/Decl.cpp
3049

Note to reviewers: I'm not super happy with that name.

Should we also have a quick test-case in Sema/ verifying that we still get the warnings that Eli mentioned?

clang/lib/AST/Decl.cpp
3049

Yeah, isReplaceableBuiltin() sounds like "can this be replaced at all?" rather than "is this acting as a replacement for something else?"

isReplacementForBuiltin() pops to mind, though in a lot of cases, these functions may end up calling the "actual" builtin (as you handle later in the patch), so maybe isProxyForBuiltin() is better?

3053

const FunctionDecl *

serge-sans-paille marked an inline comment as done.Dec 11 2019, 12:53 PM
serge-sans-paille added inline comments.
clang/lib/AST/Decl.cpp
3049

What about isBuiltInWrapper() ?

clang/lib/AST/Decl.cpp
3049

WFM

@george.burgess.iv : take into account reviews, extra testing and function renaming.

Improve test case portability.

Just a few more nits, and this LGTM. Thanks again!

IDK if Eli wanted to have another look at this; please wait until tomorrow before landing to give him time to comment.

clang/lib/AST/Decl.cpp
3049

nit: If we're going with this naming, I think it's important to highlight that we're querying if this is a definition. So isInlineBuiltinDefinition()?

clang/test/CodeGen/memcpy-nobuiltin.c
6

Generally, I think we try to keep diag checking and codegen checking separate, but this is simple enough that I don't think it's a big deal.

Please use clang's -verify + expected-warning instead of CHECKs here.

Looks like, among others, test/CodeGen/const-init.c combines -verify with FileCheck'ing codegen.

serge-sans-paille marked 6 inline comments as done.

Handle comments.

serge-sans-paille marked an inline comment as done.Dec 13 2019, 12:50 AM
efriedma added inline comments.Dec 13 2019, 11:15 AM
clang/lib/AST/Decl.cpp
3062

I'm a little concerned about this; we're subtly changing our generated code based on the "system-headerness" of the definition. We generally avoid depending on this for anything other than warnings. It could lead to weird results with preprocessed source, or if someone specifies their include paths incorrectly.

serge-sans-paille marked an inline comment as done.Dec 13 2019, 1:30 PM
serge-sans-paille added inline comments.
clang/lib/AST/Decl.cpp
3062

With the extra check on builtin status, it may no longer be necessary, let me check that.

serge-sans-paille marked an inline comment as done.

Take comment into account.

@efriedma validation passes without the checks you were pointed out, see https://github.com/serge-sans-paille/llvm-project/pull/4/checks
Thanks for making the code simpler!

clang/lib/AST/Decl.cpp
3062

@efriedma I confirm this test is no longer needed.

@serge-sans-paille any blocker for this to land? Would be great to have it in version 10. Thanks

@sylvestre.ledru : I'd like a confirmation from @efriedma first!

efriedma accepted this revision.Thu, Jan 9, 2:58 PM

LGTM

This revision is now accepted and ready to land.Thu, Jan 9, 2:58 PM
This revision was automatically updated to reflect the committed changes.

This caused a linker error in chromium:

ld.lld: error: undefined symbol: __warn_memset_zero_len

Apparently now that the glibc memset is being used, __warn_memset_zero_len gets called from libc++ code (https://github.com/llvm/llvm-project/blob/b72a8c65e4e34779b6bc9e466203f553f5294486/libcxx/include/__bit_reference#L371).

A repro:

$ cat t.cpp
#include <bitset>
void useit(std::bitset<3> &it);
std::bitset<3> getit();
int main() {
  std::bitset<3> bits = getit();
  bits.set();
  useit(bits);
}

$ clang -S -O2 t.cpp  -o - -stdlib=libc++ -D_FORTIFY_SOURCE=1 | grep warn_mem
        callq   __warn_memset_zero_len
rnk added a subscriber: rnk.Wed, Jan 15, 3:29 PM

This caused a linker error in chromium:

ld.lld: error: undefined symbol: __warn_memset_zero_len

Apparently now that the glibc memset is being used, __warn_memset_zero_len gets called from libc++ code (https://github.com/llvm/llvm-project/blob/b72a8c65e4e34779b6bc9e466203f553f5294486/libcxx/include/__bit_reference#L371).

I can reproduce the behavior locally, but without the linker error. On my system, (RHEL7.5) libc_nonshared.a provides that symbol. I've checked on other systems and it's available in that library on Ubuntu 14.04.

On other systems, this symbol is not available, neither in the lib nor in the header, because the associated gcc version is high enough. Can you try to reproduce with:

clang -S -O2 -fgnuc-version=5.0 t.cpp  -o - -stdlib=libc++ -D_FORTIFY_SOURCE=1

@akhuang : I've reported https://sourceware.org/bugzilla/show_bug.cgi?id=25399 to glibc. According to me this is more of a glibc problem than this patch, and as we have a workaround with -fgnuc-version=5.0 I think we could apply the patch back.

(It's interesting to me if gcc doesn't warn about that libcxx code, since the whole point of the gnuc 5.0 check there was "the compiler should check this for us now"...)

If a glibc-side fix for this does materialize, IMO -fgnuc-version=5.0 isn't a good path forward.

It's best if workarounds are limited in scope, lifespan, and user impact:

  • Presumably -fgnuc-version=5.0 implies many more changes that are unrelated to this particular bug.
  • Glibc upgrades can take years to materialize for users[1].
  • A lot of packages build with -D_FORTIFY_SOURCE=2 by default. Requiring extra CPPFLAGS (either -D_FORTIFY_SOURCE=0 or -fgnuc-version=5.0) would break CC=clang make-style builds that already work today, if any statement in them folds to memset(x, non_zero, 0); after optimizations.

Looks like this landed on the branch, so unless I'm mistaken, we probably need to revert there, as well?

[1] -- Anecdotally, I upgraded a project's host toolchain from using glibc 2.15 to 2.17 last year. I wanted to do 2.19, but 2.17 usage was still large enough that 2.19 wasn't an option.

rnk added a comment.Thu, Jan 16, 10:27 AM

I agree, raising -fgnuc-version is not an acceptable workaround, and we can't ship a clang that doesn't work with old versions of glibc. I think we need a different workaround. The simplest one I can think of is to make __warn_memset_zero_len a recognized builtin that generates no code.

Clang already emits this warning when it would be useful, before inlining. memset(ptr, c, n) when n is zero at runtime or at compile time after inlining should not be a warning or link error.

I believe we also branched 10.0 between this change and its revert. Can you please follow up and merge the eventual fix to the release?

In D71082#1824563, @rnk wrote:

The simplest one I can think of is to make __warn_memset_zero_len a recognized builtin that generates no code.

+1, simple and easy.

I believe we also branched 10.0 between this change and its revert. Can you please follow up and merge the eventual fix to the release?

Sure.

The simplest one I can think of is to make __warn_memset_zero_len a recognized builtin that generates no code.

+1, simple and easy.

See https://reviews.llvm.org/D72869