Page MenuHomePhabricator

Avoid simplification of library functions when callee has an implementation
Needs RevisionPublic

Authored by siddhesh on Nov 17 2020, 8:12 PM.

Details

Summary

During simplification, a library function with an implementation body would mean that it is not a standard library call and hence simplification should be avoided.

This fixes code generation for bcopy when built with -D_FORTIFY_SOURCE since the simplification would otherwise transform the bcopy calls to memmove and not builtin_memmove_chk as is expected with source fortification.

Diff Detail

Event Timeline

siddhesh created this revision.Nov 17 2020, 8:12 PM
siddhesh requested review of this revision.Nov 17 2020, 8:12 PM

Sorry, I think I ran only check-llvm. I'll look at the test breakage.

The problem you're trying to solve doesn't seem specific to bcopy: any function with an inline definition should probably *not* be replaced, and the inlined definition should be favored. This was the spirit of that patch https://reviews.llvm.org/D71082. i wonder why your test case doesn't fall into that configuration.

The problem you're trying to solve doesn't seem specific to bcopy: any function with an inline definition should probably *not* be replaced, and the inlined definition should be favored. This was the spirit of that patch https://reviews.llvm.org/D71082. i wonder why your test case doesn't fall into that configuration.

Maybe because that D71082 is limited to clang? The bcopy survives clang but llvm forces it into a memmove without this patch. Would you like to see this change be more generic, i.e. for any libcall, if the callee has a function body that has a call instruction in its first BB, avoid optimizing it?

! In D91677#2401785, @siddhesh wrote:
Maybe because that D71082 is limited to clang? The bcopy survives clang but llvm forces it into a memmove without this patch.

ok, got it. bcopy is somehow special here, because it doesn't have an associated llvm builtin like memcpy has.

Would you like to see this change be more generic, i.e. for any libcall, if the callee has a function body that has a call instruction in its first BB, avoid optimizing it?

I'd go even further: if the callee has a function body and is known has a library function, then don't optimize it because there's a provided implementation. I don't think the requirement of a call instruction in the body should be mandatory: the fortified version could provide, say, an inlined implementation ? Or maybe some setup code before calling the _chk version?

siddhesh updated this revision to Diff 306133.Nov 18 2020, 9:01 AM
siddhesh retitled this revision from Avoid transforming fortified bcopy to memmove to Avoid simplification of library functions when callee has an implementation.
siddhesh edited the summary of this revision. (Show Details)

Made the change more generic, avoiding simplification if the callee has an implementation, i.e. its body is not empty.

lebedev.ri requested changes to this revision.Nov 18 2020, 9:07 AM
lebedev.ri added a subscriber: lebedev.ri.
lebedev.ri added inline comments.
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
3023–3027

Isn't the problem that you need to mark your @bcopy with nobuiltin attribute?

3076–3079

And if it happens to be in another TU?

This revision now requires changes to proceed.Nov 18 2020, 9:07 AM
siddhesh added inline comments.Nov 18 2020, 9:21 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
3023–3027

nobuiltin is clang-specific and to make this effective, all inline versions of bcopy would have to be marked nobuiltin. There's no precedence for that in glibc, I'm not sure about other libraries.

3076–3079

It does the simplification; it is not relevant for fortify-source since those implementations are always inline.

And by mark them, you could also use -fno-builtin=???, see D68028.

And by mark them, you could also use -fno-builtin=???, see D68028.

That would imply having to always add -fno-builtin along with -D_FORTIFY_SOURCE, which seems even worse than adding __attribute__((no_builtin)) to fortified bcopy implementations in libc since it would have to be done for all sources when building with llvm just so that it does not optimize away fortifications.

Could you elaborate on why you would like to keep the option to simplify calls to library functions with implementations by default and not the other way around as this patch proposes?

Interestingly, gcc and clang have different behavior here: https://godbolt.org/z/hdadhY
And clang already honors the user-defined bcopy why is -D_FORTIFY_SOURCE different from the example here?

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
3076–3079

Would it make sense to check that the function is marked inline?
Redefining a libc function locally is probably not a good thing?

Interestingly, gcc and clang have different behavior here: https://godbolt.org/z/hdadhY

Hmm, looks like I opened a pandora's box box with this one; the behaviour seems to be very different for different ways in which one implements bcopy and some of them seem incorrect for both compilers.

Without <string.h> for example, both compilers converge and ignore the locally defined bcopy and replace the call with a memmove. This seems incorrect but I'll take a closer look. It shouldn't impact this patch though since the bcopy signature is different. I can restrict the change to non-empty *and* inline if it makes sense.

And clang already honors the user-defined bcopy why is -D_FORTIFY_SOURCE different from the example here?

I suspect clang is doing something weird there because it doesn't actually inline the bcopy; it simply removes the call; see: https://godbolt.org/z/PEssMe . I think this will be avoided altogether if bcopy simplification is avoided whenever a local definition is available. Even if the local definition is preemptable, it is always going to be selected first and the compiler should not assume that it will behave the same as the library bcopy.

That said, the _FORTIFY_SOURCE definition is not just an inline hint, it is: extern inline __attribute__((always_inline)) __attribute__((__gnu_inline__)). This should ensure that bcopy is inlined, except that clang doesn't actually inline it and replaces it with a memmove instead. This one shows more clearly how the behaviour is different between gcc and llvm for fortify-source: https://godbolt.org/z/3baWfY

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
3076–3079

Strictly speaking, I think the local function should always get precedence but if for this patch we want to limit scope, we could limit this to non-empty, inline functions.

Redefining a libc function locally is not uncommon; in fact it is a common way to build an interposition library and take advantage of symbol resolution precedence.

siddhesh added inline comments.Nov 19 2020, 5:54 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
3076–3079

Redefining a libc function locally is not uncommon; in fact it is a common way to build an interposition library and take advantage of symbol resolution precedence.

I misspoke here; there is a subtle difference between the idea of the interposition library and interposition/overriding of functions. I meant to specifically refer to overriding of functions here, i.e. making sure that the current TU uses the local version of the function and not the global one. It is not prohibited or even discouraged.