Traditionally, forcing an inlining decision requires annotating the function declaration with __attribute__((always_inline)) or __attribute__((noinline)). One problem with these attributes is that they affect every call site. Always inlining or never inlining a function may not be desirable in every context, and the usual workaround is to create several copies of the function, each with a different inline attribute. Furthermore, not every project can modify library code to add new function attributes there.
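For illustration, the duplicate-function workaround might look like the sketch below. The function names are hypothetical and not taken from the patch; only the two attributes are the existing GCC/Clang ones mentioned above.

```cpp
// Shared implementation; the compiler decides on its own whether to inline it.
static inline int square_impl(int x) { return x * x; }

// Copy 1: callers that want the body inlined everywhere use this wrapper.
__attribute__((always_inline)) inline int square_inlined(int x) {
    return square_impl(x);
}

// Copy 2: callers that want a real out-of-line call use this wrapper.
__attribute__((noinline)) int square_outlined(int x) {
    return square_impl(x);
}

// Each call site has to choose the right copy by name.
int hot_loop_body(int x) { return square_inlined(x); }
int cold_error_path(int x) { return square_outlined(x); }
```

Every function that needs both behaviours gets two extra wrappers, which is the duplication the per-callsite builtins are meant to avoid.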
This patch introduces a new way of forcing inlining decisions on a per-callsite basis. This allows for more fine-grained control over inlining, without creating any duplicate functions. The two new intrinsics for controlling inlining are:
- __builtin_always_inline(Foo()) -- inlines the function Foo at the callsite, if possible. Internally, this applies the alwaysinline attribute to the generated call instruction.
- __builtin_no_inline(Foo()) -- prevents the function Foo from being inlined at the callsite. Internally, this applies the noinline attribute to the generated call instruction.
The inline intrinsics support function, function pointer, member function, member function pointer, virtual function, and operator calls. Support for constructor calls (CXXTemporaryExpr) should also be possible, but is not part of this patch.
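A minimal sketch of per-callsite usage follows. It assumes the builtins evaluate to the result of the wrapped call and that a compiler carrying this patch reports them through __has_builtin; neither is confirmed by the description above. The CALL_* macros, HAVE_CALLSITE_INLINE, and all function names are hypothetical. On compilers without the patch the macros fall back to a plain call, so the code still builds.

```cpp
// Assumption: a patched compiler answers __has_builtin for both new builtins.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_always_inline) && __has_builtin(__builtin_no_inline)
#    define HAVE_CALLSITE_INLINE 1
#  endif
#endif

#ifdef HAVE_CALLSITE_INLINE
#  define CALL_ALWAYS_INLINE(call) __builtin_always_inline(call)
#  define CALL_NO_INLINE(call) __builtin_no_inline(call)
#else
// Fallback: no inlining hint, just perform the call.
#  define CALL_ALWAYS_INLINE(call) (call)
#  define CALL_NO_INLINE(call) (call)
#endif

struct Accumulator {
    int total = 0;
    int add(int x) { total += x; return total; }
};

// One definition, no inline attributes on the declaration.
int clamp_to_byte(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }

// The same function gets a different inlining request at each call site.
int hot_path(int v) { return CALL_ALWAYS_INLINE(clamp_to_byte(v)); }
int cold_path(int v) { return CALL_NO_INLINE(clamp_to_byte(v)); }

// Member function calls are listed as supported as well.
int member_call(Accumulator &acc, int x) { return CALL_NO_INLINE(acc.add(x)); }
```

Here clamp_to_byte is defined once and left unannotated; the inlining choice lives at the call site instead of on the declaration.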
I'd expect to be able to write
without caring whether sqrt was a real function or just a macro around __builtin_sqrt. How important is it that calls to builtin functions be errors, instead of just being ignored for this purpose?