This is a follow up on https://reviews.llvm.org/D61634
This patch is simpler and only adds the no_builtin attribute.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 39769 Build 39812: arc lint + arc unit
Event Timeline
Thank you for working on this!
It looks like you're missing all of the sema tests that check that the attribute only appertains to functions, accepts the proper kind of arguments, etc.
clang/include/clang/Basic/Attr.td | ||
---|---|---|
3426 | The documentation should explain what this is. | |
clang/lib/CodeGen/CGCall.cpp | ||
1853 | const auto -> bool (and drop the top-level const). | |
1859 | const auto & -> StringRef? | |
1861–1862 | I think this checking should happen in Sema rather than CodeGen. | |
clang/lib/Sema/SemaDeclAttr.cpp | ||
1075–1078 | I think that this should be done in a merge function; there are plenty of examples of how this is typically done elsewhere in the file. | |
1080–1081 | Given that the user has to provide the parens for the attribute anyway, I think this situation should be diagnosed rather than defaulting to the wildcard. It helps catch think-os where the user put in parens and forgot to add the parameter in the first place (the wildcard is not onerous to spell out). | |
1091 | super set -> superset | |
1092 | This silently changes the behavior from what the user might expect, which seems bad. For instance, it would be very reasonable for a user to write [[clang::no_builtin("__builtin_mem*")]] expecting that to behave as a regex pattern, but instead it silently turns into a "disable all builtins". I think it makes sense to diagnose unexpected input like that rather than silently accept it under perhaps unexpected semantics. It may also make sense to support regex patterns in the strings or at least keep the design such that we can add that functionality later without breaking users. | |
1097 | Please don't use auto here as the type is not spelled out in the initialization. | |
1100–1101 | I think this needs to happen in a merge function. |
@aaron.ballman thx a lot for the review. I really appreciate it, especially because I'm not too familiar with this part of the codebase.
clang/lib/Sema/SemaDeclAttr.cpp | ||
---|---|---|
1092 | The code looks for a function name that is exactly * and not "a function name that contains *", it would not turn [[clang::no_builtin("__builtin_mem*")]] into [[clang::no_builtin("*")]]. |
@tejohnson I believe this is the missing part for D67923.
I'm unsure if we still need the BitVector at all in the TLI since it could be a simple attribute lookup on the function. Do you see any problematic interactions with the inlining phase?
Thanks, yep I will take a closer look at the patch today.
I'm unsure if we still need the BitVector at all in the TLI since it could be a simple attribute lookup on the function.
If we didn't save the info on the TLI we would instead need to save the Function object in the TLI and query the attribute info off the Function on every lookup, which seems heavier weight. I think caching the info in that object for fast lookup is going to be better. And as noted in the comments there, we can replace it with the existing AvailableArray moved from the base Impl object into the TLI and remove the override bitvector once this goes in, we use these attributes to set the TLI info on construction, and we remove the clang code that sets the unavailability from the CodeGenOpts which will no longer be needed. If this patch goes in first I can just modify my TLI patch to do that all in one go. Maybe that is best...
Do you see any problematic interactions with the inlining phase?
The inliner will need to be modified to respect the function attributes.
clang/lib/Sema/SemaDeclAttr.cpp | ||
---|---|---|
1072 | You're missing a call to this function within mergeDeclAttribute() in SemaDecl.cpp. | |
1078–1079 | Instead of doing hasAttr<> followed by getAttr<>, this should be: if (const auto *NBA = D->getAttr<NoBuiltinAttr>()) { for (StringRef FunctionName : NBA->functionNames()) ... } But are you able to use llvm::copy() instead of a manual loop? | |
1082–1083 | Same question here about using llvm::copy(). | |
1086–1089 | Rather than walking the entire set like this, would it make more sense to look for the wildcard in the above loop before inserting the name, and set a local variable if found, so that you can do the clear without having to rescan the entire list? | |
1092 | Ah, I missed that -- great! | |
1098–1099 | Just making sure I understand the semantics you want: redeclarations do not have to have matching lists of builtins, but instead the definition will use a merged list? e.g., [[clang::no_builtin("memset")]] void whatever(); [[clang::no_builtin("memcpy")]] void whatever(); [[clang::no_builtin("memmove")]] void whatever() { // Will not use memset, memcpy, or memmove builtins. } That seems a bit strange, to me. In fact, being able to apply this attribute to a declaration seems a bit like a mistake -- this only impacts the definition of the function, and I can't imagine good things coming from hypothetical code like: [[clang::no_builtin("memset")]] void whatever(); #include "whatever.h" // Provides a library declaration of whatever() with no restrictions on it WDYT about restricting this attribute to only appear on definitions? | |
clang/test/Sema/no-builtin.c | ||
1 ↗ | (On Diff #222163) | This should not be emitting LLVM -- it should be -fsyntax-only, however. |
9 ↗ | (On Diff #222163) | We'll need more tests for redeclarations or declaration vs definition once we decide how we want to handle that. You should also add some positive tests that show the attribute applying as expected. |
thx @aaron.ballman , I'm waiting for your reply before uploading the new patch (some of my comments won't have the accompanying code update sorry)
clang/lib/Sema/SemaDeclAttr.cpp | ||
---|---|---|
1072 | Thx, I rearranged the signature a bit, do you happen to know how mergeDeclAttribute is tested? | |
1078–1079 | I had to use a vector instead of a set and uniquify by hand. | |
1086–1089 | This is is conflict with the llvm::copy suggestion above. Which one do you prefer? | |
1098–1099 | That's a very good point. Thx for noticing. I've tried to to use FunctionDecl->hasBody() during attribute handling in the Sema phase but it seems like the FunctionDecl is not complete at this point. Do you have any recommendations on where to perform the check? |
clang/lib/Sema/SemaDeclAttr.cpp | ||
---|---|---|
1098–1099 | So after some investigations it turns out that FunctionDecl::isThisDeclarationADefinition is buggy (returns always false) when called from ProcessDeclAttribute. As a matter of fact all code using this function in ProcessDeclAttribute is dead code (see D68379 which disables the dead code, tests are still passing) I'm unsure of how to fix this, it may be possible of using FunctionDecl::setWillHaveBodyin this switch. |
clang/lib/Sema/SemaDeclAttr.cpp | ||
---|---|---|
1072 | Through redeclarations, e.g., [[some_attr]] void func(); [[some_other_attr]] void func(); [[a_third_attr, some_attr]] void func() { } | |
1086–1089 | Walking the list and not calling llvm::copy. | |
1098–1099 |
That is a bit odd to me; we call it in a half dozen places in SemaDeclAttr.cpp, all of which get called from ProcessDeclAttribute. Are ALL of those uses broken currently?!
You got four of the six. What about the use in handleObjCSuppresProtocolAttr() and the second use in handleAliasAttr()?
I'm also unsure of a good way forward. @rsmith may have ideas too. |
clang/lib/Sema/SemaDeclAttr.cpp | ||
---|---|---|
1098–1099 |
It really is FunctionDecl::isThisDeclarationADefinition that is buggy, the other places where we call isThisDeclarationADefinition in SemaDeclAttr.cpp are ok, i.e. VarDecl::isThisDeclarationADefinition and ObjCProtocolDecl::isThisDeclarationADefinition I tried to use FunctionDecl::setWillHaveBodybut it broke many tests so it seems inappropriate. |
- Address aaron ballman comments
- Checks function name validity and errors when passed 0 argument.
- Update documentation and rebase
- Rewrote mergeNoBuiltinAttr
- Address comments
clang/include/clang/Basic/Attr.td | ||
---|---|---|
3426 | [nit] Should this be BuiltinNames ? builtins are not necessarily functions. | |
clang/include/clang/Basic/AttrDocs.td | ||
4409 | I would phrase it as The compiler is not allowed to add any builtin to foo's body. This covers inserting as well as replacing code with builtins. |
Sorry for the delay, I was traveling for last week.
clang/include/clang/Basic/AttrDocs.td | ||
---|---|---|
4402 | I'd appreciate a blurb in here to the effect: The attribute may also be applied to a virtual function but has no effect on the behavior of overriding functions in a derived class. | |
clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
3607 | I think this should be a warning rather than an error. Imagine the situation where the user uses Clang 11 to write their code and they disable a Clang 11-specific builtin from being used, but then they try to compile the code in Clang 10 which doesn't know about the builtin. Rather than err, it seems reasonable to warn about the unknown builtin name (under a new warning flag group under -Wattributes) and then just ignore the attribute. Because the builtin is unknown anyway, we won't transform any constructs to use it. | |
clang/lib/Sema/SemaDecl.cpp | ||
9556 | handleNoBuiltin -> handleNoBuiltinAttr | |
clang/lib/Sema/SemaDeclAttr.cpp | ||
1072 | Sorry for the code churn, but we realized we wanted this only on definitions after we started down the declaration merging path. I think we do not need mergeNoBuiltinAttr() any longer and can remove most of this logic entirely -- you cannot re-define functions, so we don't have any attributes to merge together. e.g., we no longer have this situation: void whatever() __attribute__((no_builtin("memcpy"))); void whatever() __attribute__((no_builtin("memmove"))) {} // Should ignore both memcpy and memmove I think the only part that needs to move over to handleNoBuiltinAttr() is the part diagnosing a combination of wildcard and non-wildcard arguments. As a thought for a future patch (not requesting this be done now, unless you want to), should we fix-it (or possibly support) this situation? void whatever() __attribute__((no_builtin("memcpy, memmove"))) {} // Note that it's only one string literal | |
1103 | Can you rename this to handleNoBuiltinAttr for consistency? | |
clang/test/CodeGen/no-builtin.c | ||
20 ↗ | (On Diff #224286) | I'd like to see a codegen case that ensures we don't do bad things with virtual function overrides: struct S { virtual void foo() __attribute__((no_builtin("memcpy"))) {} }; struct D : S { void foo() __attribute__((no_builtin("memmove"))) {} }; We should ensure that S::foo() only prohibits memcpy and D::foo() only prohibits memmove. |
clang/test/Sema/no-builtin.c | ||
26 ↗ | (On Diff #224286) | You should probably add another test case for this situation: void whatever() __attribute__((no_builtin("*", "*"))) {} and for member functions in C++ as well: struct S { void whatever() __attribute__((no_builtin("memcpy"))) {} virtual void intentional() __attribute__((no_builtin("memcpy"))) {} }; |
1 ↗ | (On Diff #222163) | This still has -S -emit-llvm -o -; that should be removed, no? |
clang/lib/Sema/SemaDecl.cpp | ||
---|---|---|
9556–9558 | I am not convinced that this is a bug -- the function declaration does not become a definition until the parser reaches the definition. In any case, I don't think the predicate you want is "is this declaration a definition?". Rather, I think what you want is, "does this declaration introduce an explicit function body?". In particular, we should not permit uses of this attribute on defaulted or deleted functions, nor on functions that have a definition by virtue of using __attribute__((alias)). So it probably should be a syntactic check on the form of the declarator (that is, the check you're perrforming here), and the check should probably be D.getFunctionDefinitionKind() == FDK_Definition. (A custom diagnostic for using the attribute on a defaulted or deleted function would be nice too, since the existing diagnostic text isn't really accurate in those cases.) | |
clang/test/Sema/no-builtin.c | ||
7 ↗ | (On Diff #224286) | I find this surprising. I would expect this to work, and to mean the same as -fno-builtin for the function (that is, same as the * form). I think might be a better design to remove the magical "*" form entirely, and use an empty argument list to mean "all builtins". Have you considered that? (That would better mirror how -fno-builtin vs -fno-builtin-foo works.) |
clang/lib/Sema/SemaDecl.cpp | ||
---|---|---|
9556–9558 |
Deleted functions, sure. Defaulted functions... not so sure. I could sort of imagine wanting to instruct a defaulted assignment operator that does memberwise copy that it's not allowed to use a builtin memcpy, for instance. Or is this a bad idea for some reason I'm not thinking of? | |
clang/test/Sema/no-builtin.c | ||
7 ↗ | (On Diff #224286) |
I like this approach. |
clang/lib/Sema/SemaDecl.cpp | ||
---|---|---|
9556–9558 | -fno-builtin does not turn off using llvm.memcpy for copying memory, and it doesn't turn off llvm.memcpy being lowered to a call to memcpy. Allowing this for defaulted functions would only give a false sense of security, at least for now (though I could imagine we might find some way to change that in the future). Also, trivial defaulted functions (where we're most likely to end up with memcpy calls) are often not emitted at all, instead being directly inlined by the frontend, so there's nowhere to attach a no-builtin-memcpy LLVM attribute (we'd need to put the attribute on all callers of those functions) even if LLVM learned to not emit calls to memcpy to implement llvm.memcpy when operating under a no-builtin-memcpy constraint. |
clang/lib/Sema/SemaDecl.cpp | ||
---|---|---|
9556–9558 | Ah, good to know! We're in agreement on how this should proceed, thank you for the insights. |
clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
---|---|---|
3607 | Makes perfect sense, thx for pointing this out. | |
clang/lib/Sema/SemaDecl.cpp | ||
9556–9558 |
@rsmith It may not be a bug but it is currently used in a buggy manner (See D68379). | |
clang/test/Sema/no-builtin.c | ||
26 ↗ | (On Diff #224286) | void whatever() __attribute__((no_builtin("*", "*"))) {} I've added a few ones. struct S { void whatever() __attribute__((no_builtin("memcpy"))) {} virtual void intentional() __attribute__((no_builtin("memcpy"))) {} }; What do you want me to test for this one? |
clang/include/clang/Basic/Attr.td | ||
---|---|---|
3424 | I think we do not want this to be an inheritable attribute, but just Attr instead, because this can only be written on definitions (so there's no way to inherit the attribute from previous declarations). | |
clang/test/Sema/no-builtin.c | ||
26 ↗ | (On Diff #224286) |
Ensuring that applying the attribute to member function definitions, including virtual ones, is supported and doesn't cause diagnostics. The codegen side of things will test that overridden virtual methods do not inherit the attribute. Actually, there's another interesting test hiding in there for when we do want a diagnostic (I think): struct S { void whatever() __attribute__((no_builtin("memcpy"))); // Should diagnose as not a definition }; |
clang/include/clang/Basic/AttrDocs.td | ||
---|---|---|
4407 | This mention of * is now out of date. | |
4412 | This one too. | |
clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
3609 | Hmm, I thought there was a way you could do: InGroup<DiagGroup<"invalid-no-builtin-names", IgnoredAttributes>>; but it seems I've got the order reversed and you'd have to add a new diagnostic group that IgnoredAttributes could subsume. We don't do that for other attributes, so I think you should just do: InGroup<DiagGroup<"invalid-no-builtin-names">>; (Feel free to pick a better diagnostic group name if you have a better idea, I'm not tied to my suggestion.) |
clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
---|---|---|
3609 | "invalid-no-builtin-names" LGTM |
I think we do not want this to be an inheritable attribute, but just Attr instead, because this can only be written on definitions (so there's no way to inherit the attribute from previous declarations).