This is an archive of the discontinued LLVM Phabricator instance.

[clang] Add no_builtin attribute
ClosedPublic

Authored by gchatelet on Sep 25 2019, 7:37 AM.

Event Timeline

gchatelet created this revision.Sep 25 2019, 7:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2019, 7:37 AM
aaron.ballman added a subscriber: aaron.ballman.

Thank you for working on this!

It looks like you're missing all of the sema tests that check that the attribute only appertains to functions, accepts the proper kind of arguments, etc.

clang/include/clang/Basic/Attr.td
3398

The documentation should explain what this is.

clang/lib/CodeGen/CGCall.cpp
1853

const auto -> bool (and drop the top-level const).

1859

const auto & -> StringRef?

1861–1862

I think this checking should happen in Sema rather than CodeGen.

clang/lib/Sema/SemaDeclAttr.cpp
1075–1078

I think that this should be done in a merge function; there are plenty of examples of how this is typically done elsewhere in the file.

1080–1081

Given that the user has to provide the parens for the attribute anyway, I think this situation should be diagnosed rather than defaulting to the wildcard. It helps catch think-os where the user put in parens and forgot to add the parameter in the first place (the wildcard is not onerous to spell out).

1091

super set -> superset

1092

This silently changes the behavior from what the user might expect, which seems bad. For instance, it would be very reasonable for a user to write [[clang::no_builtin("__builtin_mem*")]] expecting that to behave as a regex pattern, but instead it silently turns into a "disable all builtins".

I think it makes sense to diagnose unexpected input like that rather than silently accept it under perhaps unexpected semantics. It may also make sense to support regex patterns in the strings or at least keep the design such that we can add that functionality later without breaking users.

1097

Please don't use auto here as the type is not spelled out in the initialization.

1100–1101

I think this needs to happen in a merge function.

gchatelet updated this revision to Diff 221924.Sep 26 2019, 5:06 AM
gchatelet marked 11 inline comments as done.
  • Address aaron ballman comments

@aaron.ballman thx a lot for the review. I really appreciate it, especially because I'm not too familiar with this part of the codebase.

clang/lib/Sema/SemaDeclAttr.cpp
1092

The code looks for a function name that is exactly * and not "a function name that contains *", it would not turn [[clang::no_builtin("__builtin_mem*")]] into [[clang::no_builtin("*")]].
That said, I fully agree that an error should be returned if the function name is not valid.
I'll work on this.

gchatelet updated this revision to Diff 221939.Sep 26 2019, 7:06 AM
  • Checks function name validity and errors when passed 0 argument.

@tejohnson I believe this is the missing part for D67923.

I'm unsure if we still need the BitVector at all in the TLI since it could be a simple attribute lookup on the function. Do you see any problematic interactions with the inlining phase?

gchatelet updated this revision to Diff 222163.Sep 27 2019, 6:31 AM
  • Update documentation and rebase

@tejohnson I believe this is the missing part for D67923.

Thanks, yep I will take a closer look at the patch today.

I'm unsure if we still need the BitVector at all in the TLI since it could be a simple attribute lookup on the function.

If we didn't save the info on the TLI we would instead need to save the Function object in the TLI and query the attribute info off the Function on every lookup, which seems heavier weight. I think caching the info in that object for fast lookup is going to be better. And as noted in the comments there, we can replace it with the existing AvailableArray moved from the base Impl object into the TLI and remove the override bitvector once this goes in, we use these attributes to set the TLI info on construction, and we remove the clang code that sets the unavailability from the CodeGenOpts which will no longer be needed. If this patch goes in first I can just modify my TLI patch to do that all in one go. Maybe that is best...

Do you see any problematic interactions with the inlining phase?

The inliner will need to be modified to respect the function attributes.

aaron.ballman marked an inline comment as done.Sep 27 2019, 9:38 AM
aaron.ballman added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
1072

You're missing a call to this function within mergeDeclAttribute() in SemaDecl.cpp.

1078–1079

Instead of doing hasAttr<> followed by getAttr<>, this should be:

if (const auto *NBA = D->getAttr<NoBuiltinAttr>()) {
  for (StringRef FunctionName : NBA->functionNames())
    ...
}

But are you able to use llvm::copy() instead of a manual loop?

1082–1083

Same question here about using llvm::copy().

1086–1089

Rather than walking the entire set like this, would it make more sense to look for the wildcard in the above loop before inserting the name, and set a local variable if found, so that you can do the clear without having to rescan the entire list?

1092

Ah, I missed that -- great!

1098–1099

Just making sure I understand the semantics you want: redeclarations do not have to have matching lists of builtins, but instead the definition will use a merged list? e.g.,

[[clang::no_builtin("memset")]] void whatever();
[[clang::no_builtin("memcpy")]] void whatever();

[[clang::no_builtin("memmove")]] void whatever() {
 // Will not use memset, memcpy, or memmove builtins.
}

That seems a bit strange, to me. In fact, being able to apply this attribute to a declaration seems a bit like a mistake -- this only impacts the definition of the function, and I can't imagine good things coming from hypothetical code like:

[[clang::no_builtin("memset")]] void whatever();
#include "whatever.h" // Provides a library declaration of whatever() with no restrictions on it

WDYT about restricting this attribute to only appear on definitions?

clang/test/Sema/no-builtin.c
2

This should not be emitting LLVM -- it should be -fsyntax-only, however.

10

We'll need more tests for redeclarations or declaration vs definition once we decide how we want to handle that.

You should also add some positive tests that show the attribute applying as expected.

gchatelet marked 7 inline comments as done.Oct 1 2019, 5:20 AM

thx @aaron.ballman , I'm waiting for your reply before uploading the new patch (some of my comments won't have the accompanying code update sorry)

clang/lib/Sema/SemaDeclAttr.cpp
1072

Thx, I rearranged the signature a bit, do you happen to know how mergeDeclAttribute is tested?

1078–1079

I had to use a vector instead of a set and uniquify by hand.

1086–1089

This is is conflict with the llvm::copy suggestion above. Which one do you prefer?

1098–1099

That's a very good point. Thx for noticing.
Indeed I think it only makes sense to have the attribute on the function definition.

I've tried to to use FunctionDecl->hasBody() during attribute handling in the Sema phase but it seems like the FunctionDecl is not complete at this point.
All calls to hasBody() return false, if I repeat the operation in CGCall then hasBody returns true and I can see the CompoundStatement.

Do you have any recommendations on where to perform the check?

gchatelet marked 3 inline comments as done.Oct 3 2019, 2:29 AM
gchatelet added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
1098–1099

So after some investigations it turns out that FunctionDecl::isThisDeclarationADefinition is buggy (returns always false) when called from ProcessDeclAttribute.
I believe it's because the CompoundStatement is not parsed at this point.

As a matter of fact all code using this function in ProcessDeclAttribute is dead code (see D68379 which disables the dead code, tests are still passing)

I'm unsure of how to fix this, it may be possible of using FunctionDecl::setWillHaveBodyin this switch.

aaron.ballman added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
1072

Through redeclarations, e.g.,

[[some_attr]] void func();
[[some_other_attr]] void func();

[[a_third_attr, some_attr]] void func() {
}
1086–1089

Walking the list and not calling llvm::copy.

1098–1099

So after some investigations it turns out that FunctionDecl::isThisDeclarationADefinition is buggy (returns always false) when called from ProcessDeclAttribute.

That is a bit odd to me; we call it in a half dozen places in SemaDeclAttr.cpp, all of which get called from ProcessDeclAttribute. Are ALL of those uses broken currently?!

As a matter of fact all code using this function in ProcessDeclAttribute is dead code (see D68379 which disables the dead code, tests are still passing)

You got four of the six. What about the use in handleObjCSuppresProtocolAttr() and the second use in handleAliasAttr()?

I'm unsure of how to fix this, it may be possible of using FunctionDecl::setWillHaveBodyin this switch.

I'm also unsure of a good way forward. @rsmith may have ideas too.

gchatelet marked 2 inline comments as done.Oct 8 2019, 1:59 AM
gchatelet added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
1098–1099

That is a bit odd to me; we call it in a half dozen places in SemaDeclAttr.cpp, all of which get called from ProcessDeclAttribute. Are ALL of those uses broken currently?!
You got four of the six. What about the use in handleObjCSuppresProtocolAttr() and the second use in handleAliasAttr()?

It really is FunctionDecl::isThisDeclarationADefinition that is buggy, the other places where we call isThisDeclarationADefinition in SemaDeclAttr.cpp are ok, i.e. VarDecl::isThisDeclarationADefinition and ObjCProtocolDecl::isThisDeclarationADefinition

I tried to use FunctionDecl::setWillHaveBodybut it broke many tests so it seems inappropriate.

gchatelet marked an inline comment as done.Oct 9 2019, 6:08 AM
gchatelet added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
1098–1099

I've tried to fix it in D68701. I'm pretty sure this is not the right way to do it though,
@rsmith any idea on how to proceed?

In the meantime I'll add a FIXME and move on with this patch if you don't mind.

gchatelet marked 6 inline comments as done.Oct 9 2019, 9:33 AM
gchatelet updated this revision to Diff 224082.Oct 9 2019, 9:33 AM
  • Address aaron ballman comments
  • Checks function name validity and errors when passed 0 argument.
  • Update documentation and rebase
  • Rewrote mergeNoBuiltinAttr
  • Address comments
gchatelet updated this revision to Diff 224085.Oct 9 2019, 9:37 AM
  • Remove leftovers
Harbormaster completed remote builds in B39245: Diff 224085.

It should be ready to submit, @aaron.ballman @courbet can you take a final look?

courbet added inline comments.Oct 10 2019, 1:39 AM
clang/include/clang/Basic/Attr.td
3398

[nit] Should this be BuiltinNames ? builtins are not necessarily functions.

clang/include/clang/Basic/AttrDocs.td
4409

I would phrase it as The compiler is not allowed to add any builtin to foo's body. This covers inserting as well as replacing code with builtins.

gchatelet updated this revision to Diff 224285.Oct 10 2019, 2:26 AM
gchatelet marked 2 inline comments as done.
  • Address comments
gchatelet updated this revision to Diff 224286.Oct 10 2019, 2:30 AM
  • Fix missing rename
Harbormaster completed remote builds in B39306: Diff 224286.

A gentle ping @aaron.ballman

Sorry for the delay, I was traveling for last week.

clang/include/clang/Basic/AttrDocs.td
4402

I'd appreciate a blurb in here to the effect: The attribute may also be applied to a virtual function but has no effect on the behavior of overriding functions in a derived class.

clang/include/clang/Basic/DiagnosticSemaKinds.td
3600 ↗(On Diff #224286)

I think this should be a warning rather than an error.

Imagine the situation where the user uses Clang 11 to write their code and they disable a Clang 11-specific builtin from being used, but then they try to compile the code in Clang 10 which doesn't know about the builtin.

Rather than err, it seems reasonable to warn about the unknown builtin name (under a new warning flag group under -Wattributes) and then just ignore the attribute. Because the builtin is unknown anyway, we won't transform any constructs to use it.

clang/lib/Sema/SemaDecl.cpp
9508 ↗(On Diff #224286)

handleNoBuiltin -> handleNoBuiltinAttr

clang/lib/Sema/SemaDeclAttr.cpp
1072

Sorry for the code churn, but we realized we wanted this only on definitions after we started down the declaration merging path. I think we do not need mergeNoBuiltinAttr() any longer and can remove most of this logic entirely -- you cannot re-define functions, so we don't have any attributes to merge together. e.g., we no longer have this situation:

void whatever() __attribute__((no_builtin("memcpy")));
void whatever() __attribute__((no_builtin("memmove"))) {} // Should ignore both memcpy and memmove

I think the only part that needs to move over to handleNoBuiltinAttr() is the part diagnosing a combination of wildcard and non-wildcard arguments.

As a thought for a future patch (not requesting this be done now, unless you want to), should we fix-it (or possibly support) this situation?

void whatever() __attribute__((no_builtin("memcpy, memmove"))) {} // Note that it's only one string literal
1103

Can you rename this to handleNoBuiltinAttr for consistency?

clang/test/CodeGen/no-builtin.c
21

I'd like to see a codegen case that ensures we don't do bad things with virtual function overrides:

struct S {
  virtual void foo() __attribute__((no_builtin("memcpy"))) {}
};

struct D : S {
  void foo() __attribute__((no_builtin("memmove"))) {}
};

We should ensure that S::foo() only prohibits memcpy and D::foo() only prohibits memmove.

clang/test/Sema/no-builtin.c
2

This still has -S -emit-llvm -o -; that should be removed, no?

27

You should probably add another test case for this situation:

void whatever() __attribute__((no_builtin("*", "*"))) {}

and for member functions in C++ as well:

struct S {
  void whatever() __attribute__((no_builtin("memcpy"))) {}
  virtual void intentional() __attribute__((no_builtin("memcpy"))) {}
};
rsmith added inline comments.Oct 16 2019, 2:30 PM
clang/lib/Sema/SemaDecl.cpp
9508–9510 ↗(On Diff #224286)

I am not convinced that this is a bug -- the function declaration does not become a definition until the parser reaches the definition.

In any case, I don't think the predicate you want is "is this declaration a definition?". Rather, I think what you want is, "does this declaration introduce an explicit function body?". In particular, we should not permit uses of this attribute on defaulted or deleted functions, nor on functions that have a definition by virtue of using __attribute__((alias)). So it probably should be a syntactic check on the form of the declarator (that is, the check you're perrforming here), and the check should probably be D.getFunctionDefinitionKind() == FDK_Definition. (A custom diagnostic for using the attribute on a defaulted or deleted function would be nice too, since the existing diagnostic text isn't really accurate in those cases.)

clang/test/Sema/no-builtin.c
8

I find this surprising. I would expect this to work, and to mean the same as -fno-builtin for the function (that is, same as the * form).

I think might be a better design to remove the magical "*" form entirely, and use an empty argument list to mean "all builtins". Have you considered that? (That would better mirror how -fno-builtin vs -fno-builtin-foo works.)

aaron.ballman added inline comments.Oct 16 2019, 2:44 PM
clang/lib/Sema/SemaDecl.cpp
9508–9510 ↗(On Diff #224286)

In particular, we should not permit uses of this attribute on defaulted or deleted functions

Deleted functions, sure. Defaulted functions... not so sure. I could sort of imagine wanting to instruct a defaulted assignment operator that does memberwise copy that it's not allowed to use a builtin memcpy, for instance. Or is this a bad idea for some reason I'm not thinking of?

clang/test/Sema/no-builtin.c
8

I think might be a better design to remove the magical "*" form entirely, and use an empty argument list to mean "all builtins".

I like this approach.

rsmith added inline comments.Oct 16 2019, 3:34 PM
clang/lib/Sema/SemaDecl.cpp
9508–9510 ↗(On Diff #224286)

-fno-builtin does not turn off using llvm.memcpy for copying memory, and it doesn't turn off llvm.memcpy being lowered to a call to memcpy. Allowing this for defaulted functions would only give a false sense of security, at least for now (though I could imagine we might find some way to change that in the future).

Also, trivial defaulted functions (where we're most likely to end up with memcpy calls) are often not emitted at all, instead being directly inlined by the frontend, so there's nowhere to attach a no-builtin-memcpy LLVM attribute (we'd need to put the attribute on all callers of those functions) even if LLVM learned to not emit calls to memcpy to implement llvm.memcpy when operating under a no-builtin-memcpy constraint.

aaron.ballman added inline comments.Oct 17 2019, 6:10 AM
clang/lib/Sema/SemaDecl.cpp
9508–9510 ↗(On Diff #224286)

Ah, good to know! We're in agreement on how this should proceed, thank you for the insights.

gchatelet updated this revision to Diff 225435.Oct 17 2019, 8:04 AM
gchatelet marked 13 inline comments as done.
  • Address comments
  • Fix missing rename
  • Address comments
  • Add warning category
gchatelet added inline comments.Oct 17 2019, 8:04 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
3600 ↗(On Diff #224286)

Makes perfect sense, thx for pointing this out.

clang/lib/Sema/SemaDecl.cpp
9508–9510 ↗(On Diff #224286)

I am not convinced that this is a bug -- the function declaration does not become a definition until the parser reaches the definition.

@rsmith It may not be a bug but it is currently used in a buggy manner (See D68379).
An assert to prevent misuse would be welcome IMHO, I've added a note on the function meanwhile.

clang/test/Sema/no-builtin.c
27
void whatever() __attribute__((no_builtin("*", "*"))) {}

I've added a few ones.

struct S {
  void whatever() __attribute__((no_builtin("memcpy"))) {}
  virtual void intentional() __attribute__((no_builtin("memcpy"))) {}
};

What do you want me to test for this one?

aaron.ballman added inline comments.Oct 17 2019, 8:18 AM
clang/include/clang/Basic/Attr.td
3396

I think we do not want this to be an inheritable attribute, but just Attr instead, because this can only be written on definitions (so there's no way to inherit the attribute from previous declarations).

clang/test/Sema/no-builtin.c
27

What do you want me to test for this one?

Ensuring that applying the attribute to member function definitions, including virtual ones, is supported and doesn't cause diagnostics. The codegen side of things will test that overridden virtual methods do not inherit the attribute.

Actually, there's another interesting test hiding in there for when we do want a diagnostic (I think):

struct S {
  void whatever() __attribute__((no_builtin("memcpy"))); // Should diagnose as not a definition
};
gchatelet updated this revision to Diff 225575.Oct 18 2019, 1:59 AM
gchatelet marked 3 inline comments as done.
  • Change NoBuiltin from InheritableAttr to Attr
  • Improve tests
gchatelet updated this revision to Diff 225576.Oct 18 2019, 2:09 AM
gchatelet marked an inline comment as done.
  • Improve tests
gchatelet updated this revision to Diff 225578.Oct 18 2019, 2:18 AM
  • Reverting whole file formatting
gchatelet updated this revision to Diff 225593.Oct 18 2019, 3:58 AM
  • Call sites to virtual functions are not annotated
aaron.ballman added inline comments.Oct 18 2019, 5:50 AM
clang/include/clang/Basic/AttrDocs.td
4407

This mention of * is now out of date.

4412

This one too.

clang/include/clang/Basic/DiagnosticSemaKinds.td
3609 ↗(On Diff #225593)

Hmm, I thought there was a way you could do:

InGroup<DiagGroup<"invalid-no-builtin-names", IgnoredAttributes>>;

but it seems I've got the order reversed and you'd have to add a new diagnostic group that IgnoredAttributes could subsume. We don't do that for other attributes, so I think you should just do:

InGroup<DiagGroup<"invalid-no-builtin-names">>;

(Feel free to pick a better diagnostic group name if you have a better idea, I'm not tied to my suggestion.)

gchatelet updated this revision to Diff 225618.Oct 18 2019, 7:30 AM
  • Fix documentation and Warning category
gchatelet marked 4 inline comments as done.Oct 18 2019, 7:32 AM
gchatelet added inline comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
3609 ↗(On Diff #225593)

"invalid-no-builtin-names" LGTM

gchatelet marked an inline comment as done.Oct 22 2019, 2:45 AM
This revision is now accepted and ready to land.Oct 28 2019, 8:51 AM

LGTM!

Thx a lot for bearing with me and for your valuable comments !
Really appreciate it!

This revision was automatically updated to reflect the committed changes.