This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Add __builtin_addressof_nocfi
ClosedPublic

Authored by samitolvanen on Aug 20 2021, 11:59 AM.

Details

Summary

This change adds builtin_addressof_nocfi(). This built-in function
is similar to the existing
builtin_addressof(), except that with
-fsanitize=cfi, it returns the address of the function body instead of
a CFI jump table address.

__builtin_addressof_nocfi() is helpful in low-level code, such as
operating system kernels, which may need the address of a specific
function without a jump table indirection.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Depends on D108478

Diff Detail

Event Timeline

samitolvanen created this revision.Aug 20 2021, 11:59 AM
samitolvanen requested review of this revision.Aug 20 2021, 11:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 20 2021, 11:59 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Here's a PoC of the built-in function that returns the address of the function body with CFI. Based on D108478.

clang/lib/CodeGen/CGExprConstant.cpp
1893

fix linter warning

1894

Don't use auto here.
https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable
It's common when the type is already specified on the RHS of an assignment that's using of the casting operators that's already explicitly specialized with the type.

clang/test/CodeGen/builtin-addressof-nocfi.c
18 ↗(On Diff #367850)

do we ever need the builtin address of a global that's NOT a function?

If so, we should add a test for that. If not, perhaps we don't need to waste space in every APValue?

martong removed a subscriber: martong.Aug 27 2021, 1:29 AM
samitolvanen marked 2 inline comments as done.

Fixed clang-tidy warnings, dropped an unnecessary auto.

clang/test/CodeGen/builtin-addressof-nocfi.c
18 ↗(On Diff #367850)

do we ever need the builtin address of a global that's NOT a function?

I don't think so. This version does accept any global because it was modelled after __builtin_addressof(), but we could look into limiting this only for functions. Perhaps the built-in name should also be changed in that case?

If so, we should add a test for that. If not, perhaps we don't need to waste space in every APValue?

What would be a better place for the flag? I think in Clang, changing this to support only functions would probably just need some additional Sema checks.

pcc accepted this revision.Oct 27 2021, 5:21 PM

LGTM

clang/lib/AST/APValue.cpp
133 ↗(On Diff #369795)

Remove braces

This revision is now accepted and ready to land.Oct 27 2021, 5:21 PM
clang/test/CodeGen/builtin-addressof-nocfi.c
2 ↗(On Diff #369795)

mind adding a test case to this file that:

#if !__has_builtin(__builtin_addressof_nocfi)
#error "where did my builtin go?"
#endif
clang/lib/AST/APValue.cpp
44–46 ↗(On Diff #369795)

I think a common convention in LLVM is to have comments like:

Local{I, V, /*NoCFIValue=*/false}`

for boolean parameters.

https://llvm.org/docs/CodingStandards.html#comment-formatting

clang/include/clang/AST/APValue.h
195 ↗(On Diff #369795)

Is there be padding off the end of the bitfield? Or does this actually change the sizeof(LocalState)?

https://godbolt.org/z/eEdKoaab9

Can we just use a non-bitfield bool here?

Addressed comments.

samitolvanen marked 2 inline comments as done.

Missed a comment.

samitolvanen marked 2 inline comments as done.Oct 29 2021, 3:42 PM
ardb added a subscriber: ardb.Nov 4 2021, 3:05 AM

I would argue that the existing __builtin_addressof() should absorb this behavior, rather than adding a special builtin related to CFI.

As is documented for __builtin_addressof(), its intended use is in cases where the & operator may return something other than the address of the object, which is exactly the case we are dealing with here.

clang/test/CodeGen/builtin-addressof-nocfi.c
18 ↗(On Diff #367850)

do we ever need the builtin address of a global that's NOT a function?

I don't think so. This version does accept any global because it was modelled after __builtin_addressof(), but we could look into limiting this only for functions. Perhaps the built-in name should also be changed in that case?

Why is that necessary? That will only make it harder to use for users that want to wrap this in another macro that may be used for arbitrary symbols.

pcc requested changes to this revision.Nov 4 2021, 2:02 PM
pcc added subscribers: rjmccall, rsmith.

I would argue that the existing __builtin_addressof() should absorb this behavior, rather than adding a special builtin related to CFI.

As is documented for __builtin_addressof(), its intended use is in cases where the & operator may return something other than the address of the object, which is exactly the case we are dealing with here.

Maybe. It does imply two slightly orthogonal use cases for __builtin_addressof, one for implementing std::addressof and the other for the no-CFI use case. One question is then whether these two use cases will conflict. In libc++ we have

template <class _Tp>
inline _LIBCPP_CONSTEXPR_AFTER_CXX14
_LIBCPP_NO_CFI _LIBCPP_INLINE_VISIBILITY
_Tp*
addressof(_Tp& __x) _NOEXCEPT
{
    return __builtin_addressof(__x);
}

So as long as the additional use case for __builtin_addressof is only activated when the argument is (syntactically) a reference to a function I don't think this libc++ code will be affected. I guess it's possible that there could be other users of __builtin_addressof that could be relying on the existing behavior, although it seems unlikely to me.

A similar use case that we may want to consider is for removing the pointer authentication bits when using the proposed PAuth ABI. D112941 proposes adding a separate builtin __builtin_ptrauth_strip for this purpose, although I believe that it cannot be used in constant expressions. If we decide to let __builtin_addressof absorb the no-CFI behavior then it seems reasonable for it to absorb the PAuth stripping behavior as well, but again only if the argument is a syntactic function reference. I personally would find this somewhat confusing though because stripping in the PAuth ABI can be performed at runtime on any pointer value and not just syntactic function references.

Should the builtin return a value of type void* instead of a pointer to the type of its argument? This would make it clear that the value returned by the builtin is not meant to be called in the usual way (this applies to both CFI and the PAuth ABI) and I suppose is another point in favor of this being a separate builtin.

Adding @rsmith and @rjmccall who may have opinions on the above.

This revision now requires changes to proceed.Nov 4 2021, 2:02 PM

std::addressof(&someFunction) certainly ought to return a signed pointer under ptrauth, so if your goal here is to get a completely unadorned symbol address, I think you do need a different builtin, and it probably ought to return a void* to emphasize that it shouldn't be used as a normal value. Maybe it should even be semantically restricted to require a constant decl reference as its operand? Related and perhaps illuminating question: if it were implementable, would you also want to force the suppression of lazy-binding thunks and/or decorations like the THUMB bit?

Keep a void * return type for the nocfi variant.

pcc added a comment.Nov 5 2021, 12:09 PM

Maybe it should even be semantically restricted to require a constant decl reference as its operand?

I think we should do this.

Maybe it should be named something like __builtin_symbol_address if we're intending for this to have an effect with PAuth as well?

Yeah, I think that's a better name. The documentation can say that ideally this also wouldn't include things like the THUMB bit, but there are technical limitations that make it hard to achieve that.

You could also make this Just Work for things like C++ member functions rather than producing a member function pointer.

samitolvanen planned changes to this revision.Nov 5 2021, 3:59 PM
samitolvanen removed subscribers: rsmith, rjmccall.

You could also make this Just Work for things like C++ member functions rather than producing a member function pointer.

I'm not sure I understand. What does Just Work mean when it comes to C++ member functions?

(Adding back @rsmith, @rjmccall.)

You could also make this Just Work for things like C++ member functions rather than producing a member function pointer.

I'm not sure I understand. What does Just Work mean when it comes to C++ member functions?

I think he means that e.g. __builtin_symbol_address(&Foo::bar) should return a void* pointing to the address of the Foo::bar member function body, instead of a member function pointer for Foo::bar.

(Adding back @rsmith, @rjmccall.)

You could also make this Just Work for things like C++ member functions rather than producing a member function pointer.

I'm not sure I understand. What does Just Work mean when it comes to C++ member functions?

I think he means that e.g. __builtin_symbol_address(&Foo::bar) should return a void* pointing to the address of the Foo::bar member function body, instead of a member function pointer for Foo::bar.

Yes, exactly.

jrtc27 added a subscriber: jrtc27.Nov 13 2021, 7:07 PM

std::addressof(&someFunction) certainly ought to return a signed pointer under ptrauth, so if your goal here is to get a completely unadorned symbol address, I think you do need a different builtin, and it probably ought to return a void* to emphasize that it shouldn't be used as a normal value. Maybe it should even be semantically restricted to require a constant decl reference as its operand? Related and perhaps illuminating question: if it were implementable, would you also want to force the suppression of lazy-binding thunks and/or decorations like the THUMB bit?

Similarly, should it point to the descriptor or the entry point for ABIs with function descriptors? Personally I think trying to generalise this builtin just opens a huge can of worms when you look at other architectures and ABIs (and what it does can have implications for CHERI/Morello where we have various weird experimental ABIs), so you may just be best off having the original specialised builtin with very clear semantics.

I don't know for certain, but I would guess that the kernel wants to get the address of the first instruction in the function for the purposes of some sort of later PC-based table lookup, which means that yes, it probably *does* need to bypass descriptors on CHERI / Itanium / whatever else.

What we're trying to do here is *avoid* a can of worms by clearly understanding what the desired semantics are, rather than adding a builtin with the semantics of "ignore exactly the one complication that we ran into first".

If it's bypassing the descriptors then __builtin_symbol_address is the wrong name (and a bit ambiguous). As far as dlsym is concerned, the symbol is the descriptor, but when you get down to the ELF representation itself that's not always true. For PPC64 ELFv1, the ELF symbol is the descriptor, and the entry point has a different name. For PA-RISC and Itanium, the ELF symbol is the entry point, and you request the descriptor rather than the entry point by using a different relocation to the normal data pointer one (well, Itanium has a whole set of them, you have {32,64} x {LSB,MSB} plus a 64I one for putting into an X format instruction's immediate, and GP-relative GOT-indirect (@ltoff) versions of all those, plus a bonus 22-bit immediate one for that).

For CHERI there's the added complication that descriptors and trampolines can exist for security reasons when crossing security domains, and you absolutely should not let one compartment get pointers to the entry point of another compartment's function. You can hand it out if sealed or the permissions are cleared, as then you can't really do anything with it other than look at the integer address, but that seems a bit odd.

For CHERI there's the added complication that descriptors and trampolines can exist for security reasons when crossing security domains, and you absolutely should not let one compartment get pointers to the entry point of another compartment's function. You can hand it out if sealed or the permissions are cleared, as then you can't really do anything with it other than look at the integer address, but that seems a bit odd.

That would be consistent with getting an unsigned pointer under pointer authentication or an address with the THUMB bit potentially stripped: it's just a raw address that you can't safely call. In any case, I suspect it would be fine for us to say that we just don't support this builtin when the target is using weird function pointers unless we have some way to bypass the special treatment in LLVM.

I agree that "symbol" address is probably the wrong name. Maybe __builtin_function_start or something like that? But before we go much further on this, we should get confirmation from Peter that we're targeting the right design.

pcc added a comment.Nov 15 2021, 11:07 AM

For CHERI there's the added complication that descriptors and trampolines can exist for security reasons when crossing security domains, and you absolutely should not let one compartment get pointers to the entry point of another compartment's function. You can hand it out if sealed or the permissions are cleared, as then you can't really do anything with it other than look at the integer address, but that seems a bit odd.

That would be consistent with getting an unsigned pointer under pointer authentication or an address with the THUMB bit potentially stripped: it's just a raw address that you can't safely call. In any case, I suspect it would be fine for us to say that we just don't support this builtin when the target is using weird function pointers unless we have some way to bypass the special treatment in LLVM.

I agree that "symbol" address is probably the wrong name. Maybe __builtin_function_start or something like that? But before we go much further on this, we should get confirmation from Peter that we're targeting the right design.

Having this be defined to return the function body address (and diagnose at compile time if that's not supported) seem like the right design to me, and __builtin_function_start sounds like a good name.

Here's another example of a potential use case that came up in the course of the PAuth ABI work: testing whether a stack trace contains a particular function:
https://cs.android.com/android/platform/superproject/+/master:bionic/tests/execinfo_test.cpp;l=94

I think that no matter whether it's CFI, PAuth, CHERI or anything else, this test would want the address of the real function body. If the test is compiled on a platform where taking the real function body address is unsupported, it would fail at runtime if we just let this be compiled as a no-op, so we should diagnose the issue at compile time where possible.

clang/test/SemaCXX/builtins.cpp
44

I don't think this should always evaluate to true, should it? Maybe we should forbid these types of comparisons in integral constant expressions?

Do any of the proposed use cases actually require this to be a constant expression? Some of these patterns can be cheaply implemented with code generation even if the target doesn't otherwise support constants; for example, we could just mask the THUMB bit off on 32-bit ARM.

If we do need to support constant expressions of this, I think we should have the constant evaluator produce an expression rather than an l-value, the way we do with some other builtin calls. That should stop the comparison problem more generally.

Renamed to __builtin_function_start, allowed only FunctionDecls as a parameter, added support for C++ member functions, and disallowed comparisons in integral constant expressions.

samitolvanen marked an inline comment as done.Nov 18 2021, 9:52 AM

If we do need to support constant expressions of this

Yes, we need this also in constant expressions.

I think we should have the constant evaluator produce an expression rather than an l-value, the way we do with some other builtin calls. That should stop the comparison problem more generally.

Sure, I can take a look at how that would work. Basically, in PointerExprEvaluator::VisitBuiltinCallExpr we should not evaluate the l-value and just leave it at Result.set(E)?

If we do need to support constant expressions of this

Yes, we need this also in constant expressions.

Okay. I assume just static initializers, and not things like template arguments?

I think we should have the constant evaluator produce an expression rather than an l-value, the way we do with some other builtin calls. That should stop the comparison problem more generally.

Sure, I can take a look at how that would work. Basically, in PointerExprEvaluator::VisitBuiltinCallExpr we should not evaluate the l-value and just leave it at Result.set(E)?

Yes, exactly. Since the builtin already requires a constant operand in non-dependent contexts, that should be enough.

clang/lib/Sema/SemaChecking.cpp
199

Please update the function name here.

208

It would be more general to allow any expression that we can constant-evaluate to a specific function / member function reference. That allows callers to do stuff like __builtin_function_start((int (A::*)() const) &A::x) to resolve overloaded function references.

You should delay this check if the operand is value-dependent.

221

I think we decided that the result type should be void*. Are there other semantic checks from CheckAddressOfOperand that still meaningfully apply?

samitolvanen planned changes to this revision.Nov 18 2021, 12:50 PM

If we do need to support constant expressions of this

Yes, we need this also in constant expressions.

Okay. I assume just static initializers, and not things like template arguments?

Correct.

samitolvanen marked an inline comment as done.

Refactored to avoid evaluating the expression into an l-value.

Sure, I can take a look at how that would work. Basically, in PointerExprEvaluator::VisitBuiltinCallExpr we should not evaluate the l-value and just leave it at Result.set(E)?

Yes, exactly. Since the builtin already requires a constant operand in non-dependent contexts, that should be enough.

This does indeed solve comparison issues and allows me to drop the changes to APValue, but it breaks initializing globals in C because Clang doesn't know the expression is a compile-time constant:

$ cat test.c
void a() {}
const void *p = __builtin_function_start(a);
$ clang -c test.c
test.c:2:17: error: initializer element is not a compile-time constant
const void *p = __builtin_function_start(a);
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

I worked around this for now by explicitly allowing __builtin_function_start in CheckLValueConstantExpression, but this seems terribly hacky. What would be the correct way to solve this issue?

clang/lib/Sema/SemaChecking.cpp
208

It would be more general to allow any expression that we can constant-evaluate to a specific function / member function reference. That allows callers to do stuff like __builtin_function_start((int (A::*)() const) &A::x) to resolve overloaded function references.

I looked into using Expr::EvaluateAsConstantExpr here and while it works, I'm not sure if allowing arbitrary expressions as the argument provides any value. We can allow resolving overloaded function references without constant-evaluating the expression (and I added tests for this). Did you have any other use cases in mind where this might be useful?

221

I thought Sema::checkAddressOfFunctionIsAvailable would be useful.

I worked around this for now by explicitly allowing __builtin_function_start in CheckLValueConstantExpression, but this seems terribly hacky. What would be the correct way to solve this issue?

Try to generalize what we do for __builtin___CFStringMakeConstantString.

clang/lib/Sema/SemaChecking.cpp
208

I don't see what the advantage of limiting the constant expression would be if we can constant-evaluate it. switch doesn't force you to make case values be integer literals and/or references to enumerators. What are you trying to achieve with a restriction?

Not having arbitrary restrictions is particularly useful in C++, where templates and constexpr machinery can usefully do a lot of abstraction.

samitolvanen planned changes to this revision.Nov 23 2021, 10:33 AM

I worked around this for now by explicitly allowing __builtin_function_start in CheckLValueConstantExpression, but this seems terribly hacky. What would be the correct way to solve this issue?

Try to generalize what we do for __builtin___CFStringMakeConstantString.

Thanks, I'll take a look.

clang/lib/Sema/SemaChecking.cpp
208

I don't see what the advantage of limiting the constant expression would be if we can constant-evaluate it. switch doesn't force you to make case values be integer literals and/or references to enumerators. What are you trying to achieve with a restriction?

I'm trying to understand the benefit of allowing arbitrary expressions like __builtin_function_start(100 + a), which EvaluateAsConstantExpr is happy to evaluate into a reference to a.

I can obviously understand why these should be allowed for switch, but here we have a function whose sole purpose is to return the address of a function and I'm wondering why someone would want pass anything more complicated to it.

Not having arbitrary restrictions is particularly useful in C++, where templates and constexpr machinery can usefully do a lot of abstraction.

Perhaps I'm just not that familiar with the C++ use cases here. Would you be able to provide me an example I could use as a test case?

rjmccall added inline comments.Nov 23 2021, 1:11 PM
clang/lib/Sema/SemaChecking.cpp
208

I can obviously understand why these should be allowed for switch, but here we have a function whose sole purpose is to return the address of a function and I'm wondering why someone would want pass anything more complicated to it.

The idea is that you might have a complicated way of picking which function you mean. I agree that this should probably disallow non-zero offsets.

Perhaps I'm just not that familiar with the C++ use cases here. Would you be able to provide me an example I could use as a test case?

A simple example: fn after constexpr void (*fn)() = &foo;.

Changed the code to evaluate the argument as a constant expression.

clang/lib/Sema/SemaChecking.cpp
208

Perhaps I'm just not that familiar with the C++ use cases here. Would you be able to provide me an example I could use as a test case?

A simple example: fn after constexpr void (*fn)() = &foo;.

Thanks for the example. So, something like this should also work?

$ cat test.cc 
void a() {}
constexpr void (*fn)() = &a;
const void *p = __builtin_function_start(fn);

Looking at what happens after we evaluate the expression, APValue is an l-value containing a VarDecl:

VarDecl 0xe3427a8 <test.cc:2:1, col:27> col:18 referenced fn 'void (*const)()' constexpr cinit

Which means I have to look at VarDecl::getEvaluatedValue if I want to get the function declaration. Or should the evaluation in this case produce a FunctionDecl directly?

Your builtin is using custom type-checking (t), which suppresses all the normal conversions that happen on expressions. Specifically, it skips lvalue-to-rvalue conversion, so in this example the argument ends up being an l-value reference to a variable rather than an r-value loaded from that variable. In addition to confusing constant evaluation, it would also make this look like an ODR-use of the variable, which would be subtly wrong in some C++ cases. The fix is to explicitly request the standard l-value conversions, which you can do with code like:

ExprResult Arg = S.DefaultFunctionArrayLvalueConversion(TheCall->getArg(0));
if (Arg.isInvalid()) return true;
TheCall->setArg(0, Arg.get());

After that, constant-evaluating the argument expression in your example should give you a FunctionDecl* as expected.

samitolvanen planned changes to this revision.Nov 24 2021, 10:07 AM

Your builtin is using custom type-checking (t), which suppresses all the normal conversions that happen on expressions. Specifically, it skips lvalue-to-rvalue conversion, so in this example the argument ends up being an l-value reference to a variable rather than an r-value loaded from that variable.

OK, that explains it. Thanks for the explanation!

Use standard l-value conversions, and add a test case for constexpr.

Thanks, this is exactly what I was looking for. Just some straightforward style/design requests from here.

clang/docs/LanguageExtensions.rst
2527

Please add:

  • an example
  • the fact that the return type is void*
  • the fact that the builtin might not be supported on every target
  • an explanation that it is not safe to call the returned pointer
clang/include/clang/AST/Expr.h
3117

I like that there's a common function to resolve this, but since the behavior is so tied to this one builtin, I'm anxious about someone thinking that it's a more general convenience routine.

Could you move it to Expr and give it a signature like ValueDecl *getAsBuiltinConstantDeclRef()? And then make the comment something like:

/// If this expression is an unambiguous reference to a single declaration,
/// in the style of __builtin_function_start, return that declaration.  Note that
/// this may return a non-static member function or field in C++ if this
/// expression is a member pointer constant.

Then it's a little more general and less tied to CallExpr, and you can just cast the result to FunctionDecl in your calls, which isn't too onerous.

clang/lib/AST/ExprConstant.cpp
1963

This looks great, thanks.

clang/lib/CodeGen/CodeGenModule.h
885

Please make this take the ValueDecl* so that the details of extracting the argument are localized to the builtin-emission code.

clang/lib/Sema/SemaChecking.cpp
206

Minor style nit: I generally prefer to have these checks that are just self-evidently bailing out if something failed appear right after the call without any intervening lines. Readers can immediately see the connection that way.

samitolvanen marked 5 inline comments as done.

Addressed comments.

rjmccall accepted this revision.Dec 2 2021, 8:20 PM

This looks great, thanks. Please feel free to commit with the requested minor change to the docs.

clang/docs/LanguageExtensions.rst
2555

Please take this last sentence and make it the first sentence of the next paragraph; I think it leads into that example very nicely.

pcc accepted this revision.Dec 15 2021, 3:30 PM

LGTM

This revision is now accepted and ready to land.Dec 15 2021, 3:30 PM
This revision was automatically updated to reflect the committed changes.