This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Introduce BuiltinAttr, per-declaration builtin-ness
ClosedPublic

Authored by tambre on Apr 5 2020, 5:08 AM.

Details

Summary

Instead of relying on whether a certain identifier is a builtin, introduce BuiltinAttr to specify a declaration as having builtin semantics.

This fixes incompatible redeclarations of builtins, as reverting the identifier as being builtin due to one incompatible redeclaration would have broken rest of the builtin calls.
Mostly-compatible redeclarations of builtins also no longer have builtin semantics. They don't call the builtin nor inherit their attributes.
A long-standing FIXME regarding builtins inside a namespace enclosed in extern "C" not being recognized is also addressed.

Due to the more correct handling attributes for builtin functions are added in more places, resulting in more useful warnings.
Tests are updated to reflect that.

Intrinsics without an inline definition in intrin.h had inline and static removed as they had no effect and caused them to no longer be recognized as builtins otherwise.

A pthread_create() related test is XFAIL-ed, as it relied on it being recognized as a builtin based on its name.
The builtin declaration syntax is too restrictive and doesn't allow custom structs, function pointers, etc.
It seems to be the only case and fixing this would require reworking the current builtin syntax, so this seems acceptable.

Fixes PR45410.

Diff Detail

Event Timeline

tambre created this revision.Apr 5 2020, 5:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2020, 5:08 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
yutsumi accepted this revision.Apr 7 2020, 1:08 AM

Thank you very much. LGTM.

This revision is now accepted and ready to land.Apr 7 2020, 1:08 AM
tambre added a comment.Apr 9 2020, 1:51 AM

rsmith: ping

rsmith: ping 2

Could someone please review this?

hokein added a subscriber: rjmccall.May 7 2020, 7:42 AM

sorry for the long delay, unfortunately I'm not familiar enough with the code to do a good review.

+cc @rjmccall may have more context (from the git log), could you take a look?

hokein edited reviewers, added: rjmccall; removed: hokein.May 7 2020, 7:43 AM
hokein added a subscriber: hokein.
rjmccall requested changes to this revision.May 7 2020, 12:29 PM
rjmccall added inline comments.
clang/lib/Sema/SemaDecl.cpp
3772

The old code didn't check this eagerly. The Old->isImplicit() check is unlikely to fire; please make sure that we don't do any extra work unless it does.

3776

hasRevertedBuiltin() is global state for the identifier, so this check isn't specifically checking anything about the previous declaration, it's checking whether the identifier was ever used for a builtin. Now, the implicit-ness of the previous declaration *is* declaration-specific, and there are very few ways that we end up with an implicit function declaration. It's quite possible that implicit + identifier-is-or-was-a-builtin is sufficient to say that it's an implicit builtin declaration today. However, I do worry about the fact that this change is losing the is-predefined-library-function part of the existing check, and I think we should continue to check that somehow. If that means changing how we revert builtin-ness, e.g. by continuing to store the old builtinID but just recording that it's been reverted, that seems reasonable.

That said, I think there's a larger problem, which is that you're ignoring half of the point of the comment you've deleted:

Doing this for local extern declarations is problematic. If
the builtin declaration remains visible, a second invalid
local declaration will produce a hard error; if it doesn't
remain visible, a single bogus local redeclaration (which is
// actually only a warning) could break all the downstream code.

The problem with reverting the builtin-ness of the identifier after seeing a bad local declaration is that, after we leave the scope of that function, the global function doesn't go back to being a builtin, which can break the behavior of all the rest of the code.

If we want to handle this kind of bogus local declaration correctly, I think we need to stop relying primarily on the builtin-ness of the identifier — which is global state and therefore inherently problematic — and instead start recording whether a specific declaration has builtin semantics. The most obvious way to do that is to record an implicit BuiltinAttr when implicitly declaring a builtin, inherit that attribute only on compatible redeclarations, and then make builtin queries look for that attribute instead of checking the identifier.

Richard, do you agree?

This revision now requires changes to proceed.May 7 2020, 12:29 PM
tambre updated this revision to Diff 264441.May 16 2020, 8:29 AM

Rework builtin declaration handling. Introduce BuiltinAttr.

tambre retitled this revision from [Sema] Fix incompatible builtin redeclarations in non-global scope to [Sema] Introduce BuiltinAttr, per-declaration builtin-ness.May 16 2020, 8:34 AM
tambre edited the summary of this revision. (Show Details)
tambre updated this revision to Diff 264447.May 16 2020, 10:37 AM
tambre marked 3 inline comments as done.

Fix adding BuiltinAttr in C++ mode. Update one test.

@rsmith This is a big enough architectural change that I'd appreciate your sign-off on the basic approach.

clang/lib/Sema/SemaDecl.cpp
3772

Since BuiltinAttr is inherited, we should keep the isImplicit check, because we only want to use a special diagnostic when "redeclaring" the implicit builtin declaration.

8903

Hmm. I'm concerned about not doing any sort of semantic compatibility check here before we assign the function special semantics. Have we really never done those in C++?

If not, I wonder if we can reasonably make an implicit declaration but just make it hidden from lookup.

tambre updated this revision to Diff 264497.May 17 2020, 8:14 AM
tambre marked an inline comment as not done.

Semantic compatibility checking for C++ builtin redeclarations.
Remove some now unnecessary logic from getBuiltinID().
Update more tests. 4 tests still failing.

tambre edited the summary of this revision. (Show Details)May 17 2020, 8:17 AM
tambre marked 4 inline comments as done.EditedMay 17 2020, 8:48 AM

Thanks for the reviews and design pointers, John!

There are still a few tests failing, but I'd rather not dive in before the general approach is considered acceptable.

clang/lib/Sema/SemaDecl.cpp
8903

Currently there's no semantic compatibility checking for builtin redeclarations. There is for initial declarations though.

I've added this checking by splitting the actual builtin declaration creation off from LazilyCreateBuiltin into CreateBuiltin and checking if the current declaration is compatible with what the builtin's would be.
This results in stronger typechecking than before for builtin declarations, so some incompatible declarations are no longer marked as builtin. See cxx1z-noexcept-function-type.cpp for an example.

clang/lib/Sema/SemaExpr.cpp
6054

rewriteBuiltinFunctionDecl() creates a new function declaration and happened to disregard the builtin's attributes. This caused BuiltinAttr to go missing in a bunch of OpenCL tests, resulting in failures.

clang/test/Sema/warn-fortify-source.c
13–14

Not quite sure what to do here. These were previously recognized as builtins due to their name despite being incompatible and thus had fortify checking similar to the real memcpy.

Maybe:

  1. Introduce a generic version of ArmBuiltinAliasAttr.
  2. Something like FormatAttr.
clang/test/SemaCXX/cxx11-compat.cpp
35

For some reason this printf didn't get format checking before. Same for SemaCXX/warn-unused-local-typedef.cpp.

clang/test/SemaCXX/cxx1z-noexcept-function-type.cpp
120 ↗(On Diff #264497)

This doesn't work anymore since we now properly check builtin declaration compatibility and since C++17 noexcept is part of the function type but builtins aren't noexcept.
Thoughts?

rjmccall added inline comments.May 22 2020, 8:40 PM
clang/lib/Sema/SemaDecl.cpp
8903

That makes sense to me in principle. I'm definitely concerned about noexcept differences causing C library functions to not be treated as builtins, though; that seems stricter than we want. How reasonable is it to weaken this?

clang/test/Sema/warn-fortify-source.c
13–14

That's interesting. It definitely seems wrong to apply builtin logic to a function that doesn't have a compatible low-level signature. My inclination is to disable builtin checking here, but to notify the contributors so that they can figure out an appropriate response.

tambre updated this revision to Diff 265871.May 23 2020, 9:36 AM
tambre marked 6 inline comments as done.

Weakened noexcept checking.

tambre updated this revision to Diff 265872.May 23 2020, 9:57 AM
tambre marked 4 inline comments as done.

Remove memcpy overload tests from warn-fortify-source.c, which relied on identifier-based builtin identification.

tambre added inline comments.May 23 2020, 9:58 AM
clang/lib/Sema/SemaDecl.cpp
8903

I agree having noexcept weakened is reasonable.
I've changed it to create an identical type to the NewFD with the exception spec removed for the comparison. This fixes it.

clang/test/CodeGen/ms-intrinsics.c
23 ↗(On Diff #264497)

__stosb and friends aren't marked as builtin because they're declared as static.
I don't think there's a good reason to have builtins as static and we should simply remove the static specifier from those intrinsics in headers.
Alternatively, we could weaken compatibility checking similar to noexcept.
Thoughts?

clang/test/Sema/warn-fortify-source.c
13–14

Agreed.
I've removed this test, as there doesn't seem to be an easy way to replicate this behaviour.

clang/test/SemaCXX/cxx1z-noexcept-function-type.cpp
120 ↗(On Diff #264497)

Fixed by removing noexcept for the declaration compatibility comparison.

aaron.ballman added inline comments.May 25 2020, 10:42 AM
clang/lib/AST/Decl.cpp
3168–3170

I think this is a bit more clear:

} else if (const auto *A = getAttr<BuiltinAttr>()) {
  BuiltinID = A->getID();
}

and initialize BuiltinID to zero above.

clang/test/Sema/implicit-builtin-decl.c
64

It looks like we're losing test coverage with this change?

tambre updated this revision to Diff 267450.May 30 2020, 3:20 AM
tambre marked 4 inline comments as done.

Fix some intrinsics not being marked as builtin due to being static in the headers.
Make some code easier to read, fix test for sigsetjmp in Sema/implicit-builtin-decl.c to reflect the original intent.

tambre marked an inline comment as done.May 30 2020, 3:27 AM
tambre added inline comments.
clang/lib/AST/Decl.cpp
3168–3170

Done.

clang/test/CodeGen/callback_pthread_create.c
17

As many others prior to this patch, pthread_create was recognized as a builtin due to its name and thus had attributes applied.
Unlike others however, pthread_create is the only builtin in Builtins.def that doesn't have its arguments specified. Doing that would require implementing support for function pointers in the builtin database and adding yet another special case for pthread_t and pthread_attr_t.
That'd be quite a bit of work, which I'm not interested in doing.

How about simply removing the hack that is the pthread_create builtin entry and disabling/removing this test?

clang/test/Sema/implicit-builtin-decl.c
64

Indeed. I've reverted this change and changed the test to instead not test for it being recognized as a builtin, since it shouldn't be.

tambre updated this revision to Diff 288792.Aug 29 2020, 11:20 AM

Rebase, fix new MS builtins and other tests, XFAIL pthread_create() test

tambre edited the summary of this revision. (Show Details)Aug 29 2020, 11:25 AM

Please review. I would like to get this in a mergeable state.

FYI: PR45410, which this fixes, has been marked as a release blocker for 11.0.0.

I'd still like @rsmith to sign off here as code owner.

clang/include/clang/Basic/IdentifierTable.h
231

Do we need to support reverting builtins anymore?

clang/test/CodeGen/callback_pthread_create.c
8

I guess the problem with pthread_create is that the types are not really reasonable to synthesize. I wonder if we can use an approach more like what we do with C++, where we don't magically synthesize a declaration but where we do recognize that a particular declaration is compatible with the builtin signature.

tambre updated this revision to Diff 288911.Aug 31 2020, 3:16 AM
tambre marked 2 inline comments as done.

Remove builtin reverting.

clang/include/clang/Basic/IdentifierTable.h
231

We don't. It'd be possible to set the builtin ID of identifiers to 0 as an optimization to avoid declaration compatibility checking if there's a declaration that would hide the actual builtin. But I doubt it's worth it.

I've removed code related to this, as nothing was actually checking the reverted status anymore.

clang/test/CodeGen/callback_pthread_create.c
8

Seems like a good idea. Would also reduce code duplication between C and C++. I would be willing to look into that as a followup.

rjmccall added inline comments.Aug 31 2020, 2:43 PM
clang/lib/Serialization/ASTWriter.cpp
3387 ↗(On Diff #288911)

Clang bitcode isn't really a stable format, it's fine to just drop this bit.

I'd still like @rsmith to sign off here as code owner.

Generally, I'm happy with this direction.

What happens for builtins with the "t" (custom typechecking) flag, for which the signature is intended to have no meaning? Do we always give them builtin semantics, or never, or ... something else? I think it might be reasonable to require them to always be declared as taking unspecified arguments -- () in C and (...) in C++, or to simply say that the user cannot declare such functions themselves.

clang/lib/Headers/intrin.h
154

Does the __inline__ here do anything for a builtin function? Can we remove it along with the static?

207

Why is static being removed from some of the functions in this header but not others?

clang/lib/Sema/SemaDecl.cpp
9624–9650

I think this needs more refinement:

  • You appear to be creating and throwing away a new builtin function declaration (plus parameter declarations etc) each time you see a declaration with a matching name, even if one was already created. Given that you don't actually use D for anything other than its type, creating the declaration seems redundant and using ASTContext::GetBuiltinType would be more appropriate.
  • There are no checks of which scope the new function is declared in; this appears to apply in all scopes, but some builtin names are only reserved in the global scope (those beginning with an underscore followed by a lowercase letter such as _bittest), so that doesn't seem appropriate. The old code in FunctionDecl::getBuiltinID checked that the declaration is given C language linkage (except for _GetExceptionInfo, which was permitted to have C++ language linkage), and that check still seems appropriate to me.
  • The special case permitting _GetExceptionInfo to be declared with *any* type seems suspect; the old code permitted it to have different language linkage, not the wrong type.
  • Using typesAreCompatible in C++-specific code is weird, since C++ doesn't have a notion of "compatible types".
clang/lib/Serialization/ASTReader.cpp
975 ↗(On Diff #288911)

We don't have any stability guarantees for our AST bitcode format yet; you can just remove this bit rather than retaining a placeholder.

clang/test/Analysis/bstring.cpp
106 ↗(On Diff #288911)

This should not be recognized as a builtin, because the memset function is not extern "C".

tambre updated this revision to Diff 289631.EditedSep 2 2020, 9:15 PM
tambre marked 6 inline comments as done.

Remove __inline__ and static from all builtins in intrin.h.
Remove ASTReader/ASTWriter placeholders.
During builtin recognition:

  • Use ASTContext::GetBuiltinType() instead of creating a new function declaration every time.
  • Check for C linkage.
  • Handle builtins with custom typechecking.
tambre added a comment.Sep 2 2020, 9:17 PM

Thanks for the review!
I believe I've managed to address your comments.

What happens for builtins with the "t" (custom typechecking) flag, for which the signature is intended to have no meaning? Do we always give them builtin semantics, or never, or ... something else? I think it might be reasonable to require them to always be declared as taking unspecified arguments -- () in C and (...) in C++, or to simply say that the user cannot declare such functions themselves.

The options I see are:

  1. Disallow.
  2. Run the custom typechecking.
  3. Require as taking unspecified arguments.
  4. Simply check the name.

Implementing running custom typechecking seems it'd be tricky and not worth it.
Disallowing isn't an option as a cursory search shows that this is used in the wild.
Requiring unspecified arguments isn't feasible either as the real-world usages seem to include arguments in their declarations. E.g. __va_start from MSVC's vadefs.h, which is also tested by clang/test/SemaCXX/microsoft-varargs.cpp.

I've thus gone with simply marking as them builtins if the name matches and they're extern "C". The C linkage requirement didn't exist before, but seems a safe improvement to make.

clang/lib/Headers/intrin.h
154

It does not. Removed from all.

207

I removed static from builtins that showed up as problematic in tests. But like inline, there's really no effect. I've removed both from all builtins in this header.

clang/lib/Sema/SemaDecl.cpp
9624–9650
  • You appear to be creating and throwing away a new builtin function declaration (plus parameter declarations etc) each time you see a declaration with a matching name, even if one was already created. Given that you don't actually use D for anything other than its type, creating the declaration seems redundant and using ASTContext::GetBuiltinType would be more appropriate.

Fixed.

  • There are no checks of which scope the new function is declared in; this appears to apply in all scopes, but some builtin names are only reserved in the global scope (those beginning with an underscore followed by a lowercase letter such as _bittest), so that doesn't seem appropriate. The old code in FunctionDecl::getBuiltinID checked that the declaration is given C language linkage (except for _GetExceptionInfo, which was permitted to have C++ language linkage), and that check still seems appropriate to me.

Fixed.

  • The special case permitting _GetExceptionInfo to be declared with *any* type seems suspect; the old code permitted it to have different language linkage, not the wrong type.

I've fixed the linkage part, but the old code didn't have any type checking at all so I've kept it this way.

  • Using typesAreCompatible in C++-specific code is weird, since C++ doesn't have a notion of "compatible types".

Agreed, however I was unable to come up with a better way to do this. Let me know if you have any ideas.
That said, since I've now fixed it to check for C linkage I think this is less of a problem.

clang/test/Analysis/bstring.cpp
106 ↗(On Diff #288911)

Fixed by proper extern "C" handling.

tambre edited the summary of this revision. (Show Details)Sep 2 2020, 9:57 PM

The builtins with custom type-checking are all true intrinsics like __builtin_operator_new and so on. They really can't be validly declared by the user program. The thing that seems most likely to avoid random compiler crashes would be to either forbid explicit declarations of them or treat those as no longer being builtins. If we need to maintain compatibility with people making custom declarations, we would need to always treat them as builtins and run the risk of crashing if someone declares one with a bad signature. But I don't think it's unfair of us to break those people; that is seriously not reasonable user behavior.

It's possible that custom declarations are people trying to create compatibility shims for compilers that don't provide these as builtins. Those people should be guarding their custom declarations, preferably with __has_builtin.

tambre added a comment.EditedSep 3 2020, 9:12 AM

The builtins with custom type-checking are all true intrinsics like __builtin_operator_new and so on. They really can't be validly declared by the user program. The thing that seems most likely to avoid random compiler crashes would be to either forbid explicit declarations of them or treat those as no longer being builtins. If we need to maintain compatibility with people making custom declarations, we would need to always treat them as builtins and run the risk of crashing if someone declares one with a bad signature. But I don't think it's unfair of us to break those people; that is seriously not reasonable user behavior.

It's possible that custom declarations are people trying to create compatibility shims for compilers that don't provide these as builtins. Those people should be guarding their custom declarations, preferably with __has_builtin.

I fully agree.

However, I believe you forget to account for the example that I brought up.
In particular, MSVC's header vadefs.h includes a declaration of __va_start(), which would cause almost any code including standard headers to fail to compile on Windows.
This issue isn't isolated to some old MSVC version, as the declaration is still there in the latest Visual Studio Preview version 14.28.29213.

How about turning it into an error only for non-Microsoft builtins?
Though keeping that as a followup might be even better, as this will probably be merged into 11.0.

I didn't see the specific example, sorry. I agree that my description is more applicable to builtins in the __builtin namespace, which describes most of the builtins with custom typechecking. Is the problem just __va_start?

If we have to treat all declarations as builtins for the custom-typechecking builtins just to land this patch, I don't think that's the worst result in the world, and we can incrementally go from there. __va_start actually has a signature, it just effectively has optional arguments, which is something we could definitely teach the builtins database and signature-matcher about.

tambre updated this revision to Diff 289986.Sep 4 2020, 9:43 AM

Improve comments.

tambre added a comment.EditedSep 4 2020, 9:44 AM

I didn't see the specific example, sorry. I agree that my description is more applicable to builtins in the __builtin namespace, which describes most of the builtins with custom typechecking. Is the problem just __va_start?

I grepped for all builtins with custom typechecking.
I found two occurences in MSVC's headers: __va_start and __GetExceptionInfo, but none on my Debian Unstable machine.

If we have to treat all declarations as builtins for the custom-typechecking builtins just to land this patch, I don't think that's the worst result in the world, and we can incrementally go from there. __va_start actually has a signature, it just effectively has optional arguments, which is something we could definitely teach the builtins database and signature-matcher about.

I'd still prefer we go incrementally.

It would be nice to receive another round of code review as I have fixed the issues pointed out by Richard.

rsmith added inline comments.Sep 7 2020, 1:04 AM
clang/lib/Headers/intrin.h
476

The functions with inline definitions should still be static inline so that we don't emit them as strong external defintiions.

clang/lib/Sema/SemaDecl.cpp
9629–9630

This will give the wrong answer for

extern "C" {
namespace X {
void __builtin_foo();
}
}

... which does have C language linkage. Instead, please call FunctionDecl::getLanguageLinkage(), which knows how to handle these cases.

9647–9648

Please use ASTContext::hasSameTypeIgnoringExceptionSpec instead.

clang/lib/Serialization/ASTReader.cpp
914 ↗(On Diff #289986)

We now consider getObjCOrBuiltinID() here for the IsModule case, where we didn't before. Is that an intentional change?

tambre updated this revision to Diff 290229.Sep 7 2020, 2:59 AM
tambre marked 3 inline comments as done.

Address review comments.

tambre edited the summary of this revision. (Show Details)Sep 7 2020, 3:02 AM
tambre updated this revision to Diff 290232.Sep 7 2020, 3:11 AM
tambre marked an inline comment as done.

Remove now obsolete FIXME.

tambre added a comment.Sep 7 2020, 3:11 AM

Thanks for the review. All tests still pass, should be good for another round.

clang/lib/Sema/SemaDecl.cpp
9629–9630

Good suggestion. This fixes the long-standing FIXME inherited from getBuiltinID(). I've added a test for this.

clang/lib/Serialization/ASTReader.cpp
914 ↗(On Diff #289986)

Unintentional, fixed.

aaron.ballman added inline comments.Sep 14 2020, 8:19 AM
clang/lib/Sema/SemaDecl.cpp
2058

Should we do this check *before* we create the C linkage decl spec above?

rsmith accepted this revision.Sep 14 2020, 3:10 PM

Thanks! This looks good to me (subject to Aaron's comment being addressed). Please wait a couple of days for any more comments from the other reviewers.

clang/lib/Sema/SemaDecl.cpp
9646

Please can you add a // FIXME here that we should probably only recognize this as a builtin in the scope where the MS headers actually declare it, rather than in every scope.

tambre updated this revision to Diff 291778.Sep 14 2020, 10:15 PM
tambre marked 2 inline comments as done.

Address comments.

If there are no further comments I'll commit this in a few days.

clang/lib/Sema/SemaDecl.cpp
2058

We should. However, since now only LazilyCreateBuiltin() calls this and checks these before, we can simply removed them here and pass in the QualType.

9646

Added.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 17 2020, 9:29 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
jrtc27 added a subscriber: jrtc27.Sep 17 2020, 12:46 PM

If someone cares about pthread_create they might wish to follow up on my https://reviews.llvm.org/D58531, which I filed early last year to permit pthread_create to have a proper type in the syntax. It will likely need rebasing, but probably isn't that much work.

If someone cares about pthread_create they might wish to follow up on my https://reviews.llvm.org/D58531, which I filed early last year to permit pthread_create to have a proper type in the syntax. It will likely need rebasing, but probably isn't that much work.

Sorry I missed this before! Please feel free to ping patches every ~week if they're not getting attention.

dmajor added a subscriber: dmajor.Sep 18 2020, 9:40 AM

This commit broke Firefox builds on Mac with an error in the SDK headers. Could you please revert if a fix is not readily available?

Reproducer:

struct objc_super {};
extern "C" id objc_msgSendSuper(struct objc_super *super, SEL op, ...);
extern "C" void objc_msgSendSuper_stret(struct objc_super *super, SEL op, ...);

Result:

$ clang -c test.mm
test.mm:3:48: error: reference to 'objc_super' is ambiguous
extern "C" void objc_msgSendSuper_stret(struct objc_super *super, SEL op, ...);
                                               ^
test.mm:1:8: note: candidate found by name lookup is 'objc_super'
struct objc_super {};
       ^
note: candidate found by name lookup is 'objc_super'
1 error generated.

This commit broke Firefox builds on Mac with an error in the SDK headers. Could you please revert if a fix is not readily available?

Reproducer:

struct objc_super {};
extern "C" id objc_msgSendSuper(struct objc_super *super, SEL op, ...);
extern "C" void objc_msgSendSuper_stret(struct objc_super *super, SEL op, ...);

Result:

$ clang -c test.mm
test.mm:3:48: error: reference to 'objc_super' is ambiguous
extern "C" void objc_msgSendSuper_stret(struct objc_super *super, SEL op, ...);
                                               ^
test.mm:1:8: note: candidate found by name lookup is 'objc_super'
struct objc_super {};
       ^
note: candidate found by name lookup is 'objc_super'
1 error generated.

Looking into this, got an idea what might be wrong. Gimme an hour or two. :)

This commit broke Firefox builds on Mac with an error in the SDK headers. Could you please revert if a fix is not readily available?

Reproducer:

struct objc_super {};
extern "C" id objc_msgSendSuper(struct objc_super *super, SEL op, ...);
extern "C" void objc_msgSendSuper_stret(struct objc_super *super, SEL op, ...);

Result:

$ clang -c test.mm
test.mm:3:48: error: reference to 'objc_super' is ambiguous
extern "C" void objc_msgSendSuper_stret(struct objc_super *super, SEL op, ...);
                                               ^
test.mm:1:8: note: candidate found by name lookup is 'objc_super'
struct objc_super {};
       ^
note: candidate found by name lookup is 'objc_super'
1 error generated.

Looking into this, got an idea what might be wrong. Gimme an hour or two. :)

Fix in D87917.

We've hit a fairly subtle miscompile caused by this patch.

glibc's setjmp.h looks like this (irrelevant parts removed):

struct __jmp_buf_tag { /*...*/ };
extern int __sigsetjmp(struct __jmp_buf_tag __env[1], int);
typedef struct __jmp_buf_tag sigjmp_buf[1];
#define sigsetjmp __sigsetjmp

This worked fine with the old approach. But with the new approach, we decide the declaration of __sigsetjmp is not a builtin, because at its point of declaration, we can't compute the "proper" type because sigjmp_buf has not been declared yet. As a result, we don't add a BuiltinAttr to __sigsetjmp, but much more critically, we don't add a ReturnsTwiceAttr, which results in miscompiles in calls to this function. (I think sigsetjmp is the only affected function with glibc. jmp_buf is declared prior to __setjmp and friends.)

I suppose we don't actually care what the parameter types for __sigsetjmp are, and it would be fine (and much safer) to treat any function with that name as a builtin, like we used to. Perhaps we should have a way of marking builtins as "the given type is what we expect / what we will implicitly declare, but it's OK if it doesn't actually match"?

We've hit a fairly subtle miscompile caused by this patch.

glibc's setjmp.h looks like this (irrelevant parts removed):

struct __jmp_buf_tag { /*...*/ };
extern int __sigsetjmp(struct __jmp_buf_tag __env[1], int);
typedef struct __jmp_buf_tag sigjmp_buf[1];
#define sigsetjmp __sigsetjmp

This worked fine with the old approach. But with the new approach, we decide the declaration of __sigsetjmp is not a builtin, because at its point of declaration, we can't compute the "proper" type because sigjmp_buf has not been declared yet. As a result, we don't add a BuiltinAttr to __sigsetjmp, but much more critically, we don't add a ReturnsTwiceAttr, which results in miscompiles in calls to this function. (I think sigsetjmp is the only affected function with glibc. jmp_buf is declared prior to __setjmp and friends.)

I suppose we don't actually care what the parameter types for __sigsetjmp are, and it would be fine (and much safer) to treat any function with that name as a builtin, like we used to. Perhaps we should have a way of marking builtins as "the given type is what we expect / what we will implicitly declare, but it's OK if it doesn't actually match"?

Marking __sigsetjmp as having custom typechecking should suffice (t attribute in Builtins.def), no? Though a case in Sema::CheckBuiltinFunctionCall() might also then be necessary.

Being permissive about recognizing builtins when the expected signature requires a type that lookup can't find seems completely reasonable. We don't really want to force library functions to take the custom-typechecking path just because we want to infer an attribute for them.

Being permissive about recognizing builtins when the expected signature requires a type that lookup can't find seems completely reasonable. We don't really want to force library functions to take the custom-typechecking path just because we want to infer an attribute for them.

Proposed fix in https://reviews.llvm.org/D88518.

jdoerfert reopened this revision.Jul 11 2021, 8:11 PM
jdoerfert added a subscriber: jdoerfert.

A pthread_create() related test is XFAIL-ed, as it relied on it being recognized as a builtin based on its name.
The builtin declaration syntax is too restrictive and doesn't allow custom structs, function pointers, etc.
It seems to be the only case and fixing this would require reworking the current builtin syntax, so this seems acceptable.

First:
Do I assume right this this feature was simply disabled without any plan to:

  • inform the authors (me)
  • update the documentation
  • re-enable support eventually or provide alternatives

XFAILing a test and calling it a day seems inadequate IMHO.

Second:
Would an approach like this still work: https://reviews.llvm.org/D58531 ?

First:
Do I assume right this this feature was simply disabled without any plan to:

  • inform the authors (me)
  • update the documentation
  • re-enable support eventually or provide alternatives

XFAILing a test and calling it a day seems inadequate IMHO.

Second:
Would an approach like this still work: https://reviews.llvm.org/D58531 ?

Informing you would've probably been appropriate in hindsight.
I'm not aware of any relevant documentation that would've been appropriate to update.

Re-enabling the support depends on someone taking up the work to correctly implement the prototype recognition, which seems to be done in D58531 and would work as-is.
However, at the time it seemed to me that the whole builtin declaration recognition could use a rewrite to be able to this support this case without hacks and require less manual C++ type juggling.
I considered that work beyond my experience with the codebase and the benefits of this work easily seemed to outweigh the lack of additional annotations for this single function. I received no further feedback on the disabling of that test after explaining the reason.

tambre abandoned this revision.Oct 2 2021, 9:44 AM

Abandoning since I don't think there's any further review/actions to be done here.
D58531 would be the solution for the raised pthread_create() issue.

ychen added a subscriber: ychen.Oct 2 2021, 12:16 PM

Abandoning since I don't think there's any further review/actions to be done here.
D58531 would be the solution for the raised pthread_create() issue.

How about making this "Closed" instead of "Abandoned" considering the patch was actually landed?

tambre reclaimed this revision.Oct 2 2021, 12:49 PM
tambre accepted this revision.
tambre removed a reviewer: rjmccall.
This revision is now accepted and ready to land.Oct 2 2021, 12:50 PM
tambre closed this revision.Oct 2 2021, 12:50 PM

Abandoning since I don't think there's any further review/actions to be done here.
D58531 would be the solution for the raised pthread_create() issue.

How about making this "Closed" instead of "Abandoned" considering the patch was actually landed?

Good point. I assumed this wasn't possible as the web UI didn't expose an option for this.
But I have managed using arc close-revision. Needed to reclaim and get it into an accepted state to be able to do it though. Sorry for the noise.