This is an archive of the discontinued LLVM Phabricator instance.

[Clang][Codegen] Add GNU function attribute 'no_profile' and lower it to noprofile
ClosedPublic

Authored by nickdesaulniers on Jun 17 2021, 11:15 AM.

Details

Summary

noprofile IR attribute already exists to prevent profiling with PGO;
emit that when a function uses the newly added no_profile function
attribute.

The Linux kernel would like to avoid compiler generated code in
functions annotated with such attribute. We already respect this for
libcalls to fentry() and mcount().

Alternate implementation to: https://reviews.llvm.org/D104253
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223
Link: https://lore.kernel.org/lkml/CAKwvOdmPTi93n2L0_yQkrzLdmpxzrOR7zggSzonyaw2PGshApw@mail.gmail.com/

Diff Detail

Event Timeline

nickdesaulniers requested review of this revision.Jun 17 2021, 11:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2021, 11:15 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Should no_profile specify the inlining behavior? Though no_sanitize_* don't specify that, either.

As I commented on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223 , the Linux kernel specifies noinline so the exact inlining behavior doesn't matter.

#define noinstr                                                         \
        noinline notrace __attribute((__section__(".noinstr.text")))    \
        __no_kcsan __no_sanitize_address
clang/test/CodeGen/fprofile.c
19
// CHECK:     attributes [[ATTR2]] = {
// CHECK-NOT: noprofile
// CHECK:     }

otherwise the test may become stale if the order of ATTR and ATTR2 shuffles.

void added inline comments.Jun 17 2021, 11:45 AM
clang/include/clang/Basic/Attr.td
1974

Is this an attribute that exists in GCC? I couldn't find it...

clang/include/clang/Basic/AttrDocs.td
2565

s/function to declaration/function declaration/

Should no_profile specify the inlining behavior? Though no_sanitize_* don't specify that, either.

I think it's somehow implicit, but I can't quite say how. There are some tests that check this, e.g.
compiler-rt/test/asan/TestCases/inline.cpp

As I commented on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223 , the Linux kernel specifies noinline so the exact inlining behavior doesn't matter.

#define noinstr                                                         \
        noinline notrace __attribute((__section__(".noinstr.text")))    \
        __no_kcsan __no_sanitize_address

noinstr does add noinline, but other uses of the attribute alone might not. But in the end inlining is a red herring, it just seems to be the easiest way to enforce the promised contract. See https://lore.kernel.org/lkml/CANpmjNNRz5OVKb6PE7K6GjfoGbht_ZhyPkNG9aD+KjNDzK7hGg@mail.gmail.com/

In particular:

But the contract of
"no_sanitize" is "that a particular instrumentation or set of
instrumentations should not be applied". That contract is broken if a
function is instrumented, however that may happen.

Also always_inline needs checking, because it's a little special:

  • An always_inline function inlined into a no_sanitize function is not instrumented
  • An __always_inline function inlined into an instrumented function is instrumented

This was also previously discussed here: https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/

nickdesaulniers marked 3 inline comments as done.
  • fix td files and test CHECKs
MaskRay added a comment.EditedJun 17 2021, 2:35 PM

Should no_profile specify the inlining behavior? Though no_sanitize_* don't specify that, either.

I think it's somehow implicit, but I can't quite say how. There are some tests that check this, e.g.
compiler-rt/test/asan/TestCases/inline.cpp
[...]
noinstr does add noinline, but other uses of the attribute alone might not. But in the end inlining is a red herring, it just seems to be the easiest way to enforce the promised contract. See https://lore.kernel.org/lkml/CANpmjNNRz5OVKb6PE7K6GjfoGbht_ZhyPkNG9aD+KjNDzK7hGg@mail.gmail.com/
[...]

Thanks for the comments. Recently there is a discussion about LLVM IR function attribute nointeropt (similar to a debugging-only GCC function attribute noipa).
People tend to think attributes should be orthogonal. I think not suppressing inlining for the IR attribute no_profile is nice.
Ideally the GNU function attribute (C/C++) does not diverge much from the IR semantics.
So if we don't want to offer guarantee for IR/C/C++ attributes, we can document that users may need to additionally specify __attribute__((noinline)) to suppress inlining.

melver added a comment.EditedJun 17 2021, 2:59 PM

Should no_profile specify the inlining behavior? Though no_sanitize_* don't specify that, either.

I think it's somehow implicit, but I can't quite say how. There are some tests that check this, e.g.
compiler-rt/test/asan/TestCases/inline.cpp
[...]
noinstr does add noinline, but other uses of the attribute alone might not. But in the end inlining is a red herring, it just seems to be the easiest way to enforce the promised contract. See https://lore.kernel.org/lkml/CANpmjNNRz5OVKb6PE7K6GjfoGbht_ZhyPkNG9aD+KjNDzK7hGg@mail.gmail.com/
[...]

Thanks for the comments. Recently there is a discussion about LLVM IR function attribute nointeropt (similar to a debugging-only GCC function attribute noipa).
People tend to think attributes should be orthogonal. I think not suppressing inlining for the IR attribute no_profile is nice.
Ideally the GNU function attribute (C/C++) does not diverge much from the IR semantics.
So if we don't want to offer guarantee for IR/C/C++ attributes, we can document that users may need to additionally specify __attribute__((noinline)) to suppress inlining.

As long as it remains practical. I don't want us ending up with a mess of an attribute like we once had for KASAN: __no_kasan_or_inline -- because static inline __no_sanitize_address foo(void) { ... } did the wrong thing, because a) attribute was incompatible with 'inline', and b) caused instrumentation if it was inlined (AFAIK that behaviour is fixed and was a GCC-only bug).

Essentially, as long as no_foo means that code in function where no_foo is specified is never instrumented by foo, then we're good. If that code is inlined or not doesn't matter, as long as the contract is honored. If that's not the case we have a problem because we need ugly workarounds for what are essentially compiler bugs (contract not honored).

So if we don't want to offer guarantee for IR/C/C++ attributes, we can document that users may need to additionally specify __attribute__((noinline)) to suppress inlining.

I don't generally like that approach:

  1. it's not easy for developers to validate their call call chains to ensure that a caller with a restrictive function attribute doesn't have unrestricted callers inlined into the restricted caller.
  2. that behavior opposes the principle of least surprise.
  3. We don't have a statement attribute for call sites to say "please don't inline this call" which would be fine grain. noinline is a course-grain hammer; what if we want to inline a callee into most callers but not this one caller that has such a restricted function attribute?

See also D91816 where I took the conservative approach for no_stack_protector and simply chose not to perform such inline substitution on caller and callee function attribute mismatch. I find this behavior to be the least surprising, and the developer is provided with introspection as to why the compile did not perform such inlining via -Rpass=inline flag.

So if we don't want to offer guarantee for IR/C/C++ attributes, we can document that users may need to additionally specify __attribute__((noinline)) to suppress inlining.

I don't generally like that approach:

  1. it's not easy for developers to validate their call call chains to ensure that a caller with a restrictive function attribute doesn't have unrestricted callers inlined into the restricted caller.
  2. that behavior opposes the principle of least surprise.
  3. We don't have a statement attribute for call sites to say "please don't inline this call" which would be fine grain. noinline is a course-grain hammer; what if we want to inline a callee into most callers but not this one caller that has such a restricted function attribute?

See also D91816 where I took the conservative approach for no_stack_protector and simply chose not to perform such inline substitution on caller and callee function attribute mismatch. I find this behavior to be the least surprising, and the developer is provided with introspection as to why the compile did not perform such inlining via -Rpass=inline flag.

Agree. I think we just raced to say the same thing. ;-)

MaskRay added a comment.EditedJun 17 2021, 3:26 PM

So if we don't want to offer guarantee for IR/C/C++ attributes, we can document that users may need to additionally specify __attribute__((noinline)) to suppress inlining.

I don't generally like that approach:

If a guarantee would cause inferior optimization or we cannot think of all the complication/fallout initially, I'd prefer that we keep the initial semantics narrow.
If we cannot make a promise, don't provide a promise.
No-suppressing inlining makes the attribute freely composable with other attributes (https://lists.llvm.org/pipermail/llvm-dev/2021-April/150062.html )

  1. it's not easy for developers to validate their call call chains to ensure that a caller with a restrictive function attribute doesn't have unrestricted callers inlined into the restricted caller.
  2. that behavior opposes the principle of least surprise.

Disagree. If the user wants a function not to be inlined, let them add __attribute__((noinline)).

  1. We don't have a statement attribute for call sites to say "please don't inline this call" which would be fine grain. noinline is a course-grain hammer; what if we want to inline a callee into most callers but not this one caller that has such a restricted function attribute?

See also D91816 where I took the conservative approach for no_stack_protector and simply chose not to perform such inline substitution on caller and callee function attribute mismatch. I find this behavior to be the least surprising, and the developer is provided with introspection as to why the compile did not perform such inlining via -Rpass=inline flag.

I think D91816 was an unfortunate case. That would require the inliner to understand every possible attribute. That was the practical approach back then because it is not easy for the kernel to cope.
For new things we should strive for making the kernel do the right thing.

For example, if a kernel function can sometimes be inlinable, sometimes not: add a alias and add __attribute__((noinline)) to that alias.

So if we don't want to offer guarantee for IR/C/C++ attributes, we can document that users may need to additionally specify __attribute__((noinline)) to suppress inlining.

I don't generally like that approach:

If a guarantee would cause inferior optimization or we cannot think of all the complication/fallout initially, I'd prefer that we keep the initial semantics narrow.

The guarantee is more important than the optimization.

If we cannot make a promise, don't provide a promise.
No-suppressing inlining makes the attribute freely composable with other attributes (https://lists.llvm.org/pipermail/llvm-dev/2021-April/150062.html )

IIUC, isn't @jdoerfert arguing that optnone and noinline being composable is _bad_?

  1. it's not easy for developers to validate their call call chains to ensure that a caller with a restrictive function attribute doesn't have unrestricted callers inlined into the restricted caller.
  2. that behavior opposes the principle of least surprise.

Disagree. If the user wants a function not to be inlined, let them add __attribute__((noinline)).

  1. We don't have a statement attribute for call sites to say "please don't inline this call" which would be fine grain. noinline is a course-grain hammer; what if we want to inline a callee into most callers but not this one caller that has such a restricted function attribute?

See also D91816 where I took the conservative approach for no_stack_protector and simply chose not to perform such inline substitution on caller and callee function attribute mismatch. I find this behavior to be the least surprising, and the developer is provided with introspection as to why the compile did not perform such inlining via -Rpass=inline flag.

I think D91816 was an unfortunate case. That would require the inliner to understand every possible attribute.

Hyperbole. Only a few such function attributes would need such restrictions.

That was the practical approach back then because it is not easy for the kernel to cope.
For new things we should strive for making the kernel do the right thing.

For example, if a kernel function can sometimes be inlinable, sometimes not: add a alias and add __attribute__((noinline)) to that alias.

Interesting idea, but noinline is straight up broken in LLVM: https://godbolt.org/z/MoE1a7c3v. The Linux kernel relies on noinline meaning don't perform inline substitution. That violates the principal of least surprise. When a developer uses noinline, no_stack_protector, or no_profile, they mean DO NOT inline, insert a stack protector, or insert profiling/coverage instrumentation EVER.

Also debugging to find out that inlining was the cause of such violated expectations sucks.

This is missing all of the usual semantic tests (attribute accepts no args, applies only to functions, etc).

clang/include/clang/Basic/Attr.td
1975

Should this also be supported on ObjC methods?

clang/include/clang/Basic/AttrDocs.td
2566
nickdesaulniers marked an inline comment as done.
  • add hyphen to docs, add boilerplate test to check Subjects.
clang/include/clang/Basic/Attr.td
1975

Wouldn't that be the case for many surrounding attributes defined near by? I don't see why not, but I also know literally nothing about ObjC. Would you like me to extend this attribute to ObjC methods?

aaron.ballman accepted this revision.Jun 18 2021, 11:31 AM

LGTM!

clang/include/clang/Basic/Attr.td
1975

It would be, and I think many of them are in the same state for the same reason (not knowing about ObjC).

I don't insist on the change. I ask whenever I see Function in the subject list because it's pretty rare that ObjC methods work differently than a function call in ways that matter for attributes. We can always add the functionality when a user requests it, later.

This revision is now accepted and ready to land.Jun 18 2021, 11:31 AM
void accepted this revision.Jun 18 2021, 11:58 AM

So if we don't want to offer guarantee for IR/C/C++ attributes, we can document that users may need to additionally specify __attribute__((noinline)) to suppress inlining.

I don't generally like that approach:

If a guarantee would cause inferior optimization or we cannot think of all the complication/fallout initially, I'd prefer that we keep the initial semantics narrow.

The guarantee is more important than the optimization.

If we cannot make a promise, don't provide a promise.
No-suppressing inlining makes the attribute freely composable with other attributes (https://lists.llvm.org/pipermail/llvm-dev/2021-April/150062.html )

IIUC, isn't @jdoerfert arguing that optnone and noinline being composable is _bad_?

No, the opposite. @jdoerfert is arguing that attributes should be composable.
I agree and I think the optnone design has some orthogonality problems.

  1. it's not easy for developers to validate their call call chains to ensure that a caller with a restrictive function attribute doesn't have unrestricted callers inlined into the restricted caller.
  2. that behavior opposes the principle of least surprise.

Disagree. If the user wants a function not to be inlined, let them add __attribute__((noinline)).

  1. We don't have a statement attribute for call sites to say "please don't inline this call" which would be fine grain. noinline is a course-grain hammer; what if we want to inline a callee into most callers but not this one caller that has such a restricted function attribute?

See also D91816 where I took the conservative approach for no_stack_protector and simply chose not to perform such inline substitution on caller and callee function attribute mismatch. I find this behavior to be the least surprising, and the developer is provided with introspection as to why the compile did not perform such inlining via -Rpass=inline flag.

I think D91816 was an unfortunate case. That would require the inliner to understand every possible attribute.

Hyperbole. Only a few such function attributes would need such restrictions.

That was the practical approach back then because it is not easy for the kernel to cope.
For new things we should strive for making the kernel do the right thing.

For example, if a kernel function can sometimes be inlinable, sometimes not: add a alias and add __attribute__((noinline)) to that alias.

Interesting idea, but noinline is straight up broken in LLVM: https://godbolt.org/z/MoE1a7c3v. The Linux kernel relies on noinline meaning don't perform inline substitution. That violates the principal of least surprise. When a developer uses noinline, no_stack_protector, or no_profile, they mean DO NOT inline, insert a stack protector, or insert profiling/coverage instrumentation EVER.

Also debugging to find out that inlining was the cause of such violated expectations sucks.

It is not broken. That is constant propagation in early-cse. A function call can be optimized out if it has no side effect and its value can be determined. The behavior may look like inlining. That is the nointeropt IR attribute proposal intends to address.
I guess for most(all?) non-trivial noinline usage in the Linux kernel, we can assume noinline is fine.

MaskRay accepted this revision.Jun 18 2021, 12:09 PM
MaskRay added a subscriber: marxin.

LG.

@marxin FYI the GNU-style function attribute no_profile

MaskRay added inline comments.Jun 18 2021, 12:10 PM
clang/test/CodeGen/fprofile.c
2

no_profile may be a better test name.

nickdesaulniers retitled this revision from [Clang][Codegen] emit noprofile IR Fn Attr for new Fn Attr no_profile to [Clang][Codegen] Add GNU function attribute 'no_profile' and lower it to noprofile.Jun 18 2021, 12:13 PM
  • rename test, add __has_attribute unit test
phosek accepted this revision.Jun 18 2021, 12:37 PM

LGTM I'm happy to see that the noprofile LLVM attribute is useful in other contexts.

This revision was landed with ongoing or failed builds.Jun 18 2021, 1:42 PM
This revision was automatically updated to reflect the committed changes.

rG817218336aa3e3c0ca422ae00f8d8ca41b8fbd6d follow up based on reports of test failure on mac.

It looks like GCC has had no_profile_instrument_function since GCC 7.1 for this purpose.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223#c11

Follow up: https://reviews.llvm.org/D104658.