This is an archive of the discontinued LLVM Phabricator instance.

Differential D99517

Implemented [[clang::musttail]] attribute for guaranteed tail calls.
ClosedPublic

Authored by haberman on Mar 29 2021, 9:46 AM.

Details

Reviewers

rsmith
aaron.ballman
rjmccall
varungandhi-apple
jdoerfert

Commits

rG834467590842: Implemented [[clang::musttail]] attribute for guaranteed tail calls.

Summary

This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.

The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.

Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.

Future work would be to emit a warning if a user tries to pass
a pointer or reference to a local variable through a musttail call.
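
As a rough illustration of the placement described above (the attribute goes on the return statement, not on the function), here is a minimal sketch. The `MUSTTAIL` macro and the `is_even`/`is_odd` functions are hypothetical names chosen for this example. The macro expands to nothing on compilers that lack the attribute, in which case these are ordinary calls with no tail-call guarantee.

```cpp
#include <cassert>

// Hypothetical helper macro: use the attribute only where it is available.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

bool is_odd(unsigned n);

// Caller and callee share the same signature, and nothing in scope needs
// cleanup after the call, so the call can be the last action of the function.
bool is_even(unsigned n) {
  if (n == 0) return true;
  MUSTTAIL return is_odd(n - 1);
}

bool is_odd(unsigned n) {
  if (n == 0) return false;
  MUSTTAIL return is_even(n - 1);
}

void example_usage() {
  assert(is_even(10));
  assert(is_odd(7));
  assert(!is_even(3));
}
```

With the attribute in effect, the mutual recursion should not grow the stack; without it, that depends on the optimizer.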

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	140 ms	x64 debian > Clang.CodeGenOpenCL::builtins-amdgcn.cl
	270 ms	x64 windows > Clang.CodeGenOpenCL::builtins-amdgcn.cl

Event Timeline

rsmith added inline comments. Apr 1 2021, 12:41 PM

clang/lib/CodeGen/CGCall.cpp
5315–5317	Yes, I think we should validate this by an assertion if we can. We can check this by walking the cleanup scope stack (walk from `CurrentCleanupScopeDepth` to `EHScopeStack::stable_end()`) and making sure that there is no "problematic" enclosing cleanup scope. Here, "problematic" would mean any scope other than an `EHCleanupScope` containing only `CallLifetimeEnd` cleanups.
	Looking at the kinds of cleanups that we might encounter here, I think there may be a few more things that Sema needs to check in order to not get in the way of exception handling. In particular, I think we should reject if the callee is potentially-throwing and the musttail call is inside a try block or a function that's either noexcept or has a dynamic exception specification.
	Oh, also, we should disallow musttail calls inside statement expressions, in order to defend against cleanups that exist transiently within an expression.
clang/lib/CodeGen/CGExpr.cpp
4829	The more I think about this, the more it makes me nervous: if any of the `EmitCallExpr` functions below incidentally emit a call on the way to producing their results via the CGCall machinery, and do so without recursing through this function, that incidental call will be emitted as a tail call instead of the intended one. Specifically:
	- I could imagine a block call involving multiple function calls, depending on the blocks ABI.
	- I could imagine a member call performing a function call to convert from derived to virtual base in some ABIs.
	- A CUDA kernel call in general involves calling a setup function before the actual function call happens (and it doesn't make sense for a CUDA kernel call to be a tail call anyway...)
	- A call to a builtin can result in any number of function calls.
	- If any expression in the function arguments emits a call without calling back into this function, we'll emit that call as a tail call instead of this one. Eg, `[[clang::musttail]] return f(dynamic_cast<T>(p));` might emit the call to `__cxa_dynamic_cast` as the tail call instead of emitting the call to `f` as the tail call, depending on whether the CGCall machinery is used when emitting the `__cxa_dynamic_cast` call.
	Is it feasible to sink this check into the `CodeGenFunction::EmitCall` overload that takes a `CallExpr`, `CodeGenFunction::EmitCXXMemberOrOperatorCall`, and `CodeGenFunction::EmitCXXMemberPointerCallExpr`, after we've emitted the callee and call args? It looks like we might be able to check this immediately before calling the CGCall overload of `EmitCall`, so we could pass in the 'musttail' information as a flag or similar instead of using global state in the `CodeGenFunction` object; if so, it'd be much easier to be confident that we're applying the attribute to the right call.
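
The hazard raised in this comment can be sketched with a toy model. This is not Clang's CodeGen; `ToyEmitter`, `EmittedCall`, and the two `emit...` helpers are hypothetical names used only to contrast a "pending" flag held as object state with a flag passed explicitly to the call that should receive it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: each emitted call records whether it was marked as the tail call.
struct EmittedCall {
  std::string callee;
  bool mustTail;
};

struct ToyEmitter {
  std::vector<EmittedCall> calls;
  bool pendingMustTail = false;  // the "global state" approach

  // State-based version: whichever call is emitted first consumes the flag.
  void emitCallUsingState(const std::string &callee) {
    calls.push_back({callee, pendingMustTail});
    pendingMustTail = false;
  }

  // Explicit version: the caller says which call is the tail call.
  void emitCall(const std::string &callee, bool isMustTail) {
    calls.push_back({callee, isMustTail});
  }
};

// Models `[[clang::musttail]] return f(dynamic_cast<T>(p));` where emitting
// the argument incidentally emits a runtime call before the call to f.
ToyEmitter emitWithState() {
  ToyEmitter e;
  e.pendingMustTail = true;
  e.emitCallUsingState("__cxa_dynamic_cast");  // argument evaluation
  e.emitCallUsingState("f");                   // the intended tail call
  return e;
}

ToyEmitter emitWithFlag() {
  ToyEmitter e;
  e.emitCall("__cxa_dynamic_cast", /*isMustTail=*/false);
  e.emitCall("f", /*isMustTail=*/true);
  return e;
}

void example_usage() {
  ToyEmitter bad = emitWithState();
  assert(bad.calls[0].mustTail && !bad.calls[1].mustTail);  // wrong call marked
  ToyEmitter good = emitWithFlag();
  assert(!good.calls[0].mustTail && good.calls[1].mustTail);
}
```

In the state-based version the incidental call takes the marking; passing the flag at the call site removes that ambiguity.
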
clang/lib/Sema/SemaStmt.cpp
594	It's a bit awkward, but I think we should delay this check until after the others -- complaining about non-trivial destruction seems beside the point if the returned value isn't a function call.
	Also, the diagnostic text for this error seems narrower than the cases it covers. For example:
	    void f(const char *);
	    void g(const char *s) { [[clang::musttail]] return f((s + "foo"s).c_str()); }
	would be diagnosed as "attribute requires that the return type and all arguments are trivially destructible", and they are; the problem is that the return value creates a temporary object with non-trivial destruction.
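
To see why a temporary with non-trivial destruction gets in the way, the following sketch logs the order of events for an ordinary (non-musttail) call. `Temp`, `callee`, `caller`, and the `events` log are hypothetical names for this illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log used only for this illustration.
static std::vector<std::string> events;

struct Temp {
  ~Temp() { events.push_back("~Temp"); }
  const char *c_str() const { return "x"; }
};

void callee(const char *) { events.push_back("callee"); }

void caller() {
  // The Temp object lives until the end of the full-expression, so its
  // destructor runs after callee() returns. Work remains in caller() after
  // the call, so the call is not in tail position.
  return callee(Temp().c_str());
}

void example_usage() {
  events.clear();
  caller();
  assert(events.size() == 2);
  assert(events[0] == "callee");
  assert(events[1] == "~Temp");
}
```

The parameter and return types here are all trivially destructible; it is the temporary in the argument expression that forces code to run after the call.
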
clang/test/CodeGen/attr-musttail.cpp
72 ↗	(On Diff #334556)	For completeness, can we also get a `CHECK-NEXT: ret void` here too?

Addressed more comments for musttail.
Reject constructors and destructors from musttail.
Fixed a few bugs and fixed the tests.
Added Obj-C test.

I added tests for all the cases you mentioned. PTAL.

clang/lib/CodeGen/CGCall.cpp
5315–5317	I'm having trouble implementing the check because there doesn't appear to be any discriminator in `EHScopeStack::Cleanup` that will let you test if it is a `CallLifetimeEnd`. (The actual code just does virtual dispatch through `EHScopeStack::Cleanup::Emit()`.) I temporarily implemented this by adding an extra virtual function to act as discriminator.
	The check fires if a VLA is in scope:
	    int Func14(int x) { int vla[x]; [[clang::musttail]] return Bar(x); }
	Do we need to forbid VLAs or do I need to refine the check?
	It appears that `JumpDiagnostics.cpp` is already diagnosing statement expressions and `try`. However I could not get testing to work. I tried adding a test with `try` but even with `-fexceptions` I am getting:
	    cannot use 'try' with exceptions disabled
clang/lib/CodeGen/CGExpr.cpp
4829	Done. It's feeling like `IsMustTail`, `callOrInvoke`, and `Loc` might want to get collapsed into an options struct, especially given the default parameters on the first two. Maybe could do as a follow up?
clang/test/CodeGen/attr-musttail.cpp
72 ↗	(On Diff #334556)	I turned on LLVM verification via `opt` so I think this should get verified by the IR verifier. Is that sufficient?

Harbormaster completed remote builds in B96855: Diff 334890. Apr 1 2021, 10:01 PM

Fixed unit test by running opt in a separate invocation.
Formatting fixes.

Harbormaster completed remote builds in B96921: Diff 334985. Apr 2 2021, 11:26 AM

rsmith added a reviewer: rjmccall. Apr 2 2021, 2:00 PM

Thanks, I think this is looking really good.

@rjmccall, no explicit need to review; I just wanted to make sure you'd seen this and had a chance to express any concerns before we go ahead.

clang/include/clang/Basic/AttrDocs.td
452	One thing I'd add: If the callee is a virtual function that is implemented by a thunk, there is no guarantee in general that the thunk tail-calls the implementation of the virtual function, so such a call in a recursive cycle can still result in unbounded stack growth.
clang/lib/CodeGen/CGCall.cpp
5315–5317	> Do we need to forbid VLAs or do I need to refine the check?
	Assuming that LLVM supports musttail calls from functions where a dynamic alloca is in scope, I think we should allow VLAs. The `musttail` documentation doesn't mention this, so I think it's OK, and I can't think of a good reason why you wouldn't be able to `musttail` call due to a variably-sized frame. Perhaps a good model would be to add a virtual function to permit asking a cleanup whether it's optional / skippable.
	> I could not get testing to work.
	You need `-fcxx-exceptions` to use `try`. At the `-cc1` level, we have essentially-orthogonal settings for "it's valid for exceptions to unwind through this code" (`-fexceptions`) and "C++ exception handling syntax is permitted" (`-fcxx-exceptions`), and you usually need to enable both for CodeGen tests involving exceptions.
5318–5319	Given the potential for mismatch between the JumpDiagnostics checks and this one, especially as new more exotic kinds of cleanup are added, I wonder if we should use an `ErrorUnsupported` here instead of an `assert`. I strongly suspect we can still reach the problematic case here for a tail call in a statement expression. I don't think it's feasible to check for all the ways that an arbitrary expression context can have pending cleanups, which we'd need in order to produce precise `Sema` diagnostics for that, so either we handle that here or we blanket reject all `musttail` returns in statement expressions. I think either approach is probably acceptable.
clang/lib/Sema/SemaStmt.cpp
596–599	I would have thought this assert would fire for `void f() { [[clang::musttail]] return; }`. If so, we should reject this case with a diagnostic.
601	`IgnoreUnlessSpelledInSource` is a syntactic check that's only really intended for tooling use cases; I think we want something a bit more semantic here, so `IgnoreImplicitAsWritten` would be more appropriate. I think it would be reasonable to also skip "parentheses" here (which we treat as also including things like C's `_Generic`). Would `Ex->IgnoreImplicitAsWritten()->IgnoreParens()` work?
	If we're going to skip elidable copy construction of the result here (which I think we should), should we also reflect that in the AST? Perhaps we should strip the return value down to being just the call expression?
	I'm thinking in particular of things like building in C++14 or before with `-fno-elide-constructors`, where code generation for a by-value return of a class object will synthesize a local temporary to hold the result, with a final destination copy emitted after the call. (Testcase: `struct A { A(const A&); }; A f(); A g() { [[clang::musttail]] return f(); }` with `-fno-elide-constructors`.)
609	A call expression doesn't necessarily have a known callee declaration. I would expect this assert to fire on a case like:
	    void f() { void (*p)() = f; [[clang::musttail]] return p(); }
	We should reject this with a diagnostic.
632–633	Please pass in a flag here so the diagnostic can `%select` and produce a more specific description of the problem.
644	There are a couple of other contexts that can include a return statement: the caller could also be an `ObjCMethodDecl` (an Objective-C method) or a `CapturedDecl` (the body of a `#pragma omp` parallel region). I'd probably use a specific diagnostic ("cannot be used from a block" / "cannot be used from an Objective-C function") for the block and ObjCMethod case, and a nonspecific-but-correct "cannot be used from this context" for anything else.
clang/test/CodeGen/attr-musttail.cpp
72 ↗	(On Diff #334556)	We don't like Clang's tests depending on `opt` in general, but I think in this case it's an acceptable crutch until we fix Clang to run the verifier on its IR output again (as discussed offline, it looks like we lost that as part of the transition to the new pass manager). Please add a FIXME to remove the call to `opt` once that bug is fixed. Other than that, I'm fine with this approach.

rsmith added inline comments. Apr 2 2021, 2:15 PM

clang/lib/CodeGen/CGCall.cpp
5315–5317	Or maybe instead of "is optional / skippable", the right question is, "is this redundant if we're about to return?" That way we could potentially one day reuse the same mechanism to also skip emitting such cleanups when emitting a cleanup path into the return block.
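
The "ask each cleanup" idea discussed in this thread might look roughly like the toy sketch below. These are stand-in classes, not the real `EHScopeStack` types; `ToyCleanup`, `isRedundantBeforeReturn`, and `canEmitMustTail` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-ins; these are not the real EHScopeStack classes.
struct ToyCleanup {
  virtual ~ToyCleanup() = default;
  // Hypothetical query: is this cleanup redundant if we are about to return?
  virtual bool isRedundantBeforeReturn() const { return false; }
};

// Models a lifetime-end marker, which has no effect once the frame is gone.
struct ToyLifetimeEnd : ToyCleanup {
  bool isRedundantBeforeReturn() const override { return true; }
};

// Models a destructor call, which must run and so blocks a tail call.
struct ToyDestructorCall : ToyCleanup {};

using ToyCleanupStack = std::vector<std::unique_ptr<ToyCleanup>>;

// Walk every enclosing cleanup; a single required cleanup rules out musttail.
bool canEmitMustTail(const ToyCleanupStack &stack) {
  for (const auto &cleanup : stack)
    if (!cleanup->isRedundantBeforeReturn())
      return false;
  return true;
}

void example_usage() {
  ToyCleanupStack stack;
  stack.push_back(std::unique_ptr<ToyCleanup>(new ToyLifetimeEnd()));
  assert(canEmitMustTail(stack));
  stack.push_back(std::unique_ptr<ToyCleanup>(new ToyDestructorCall()));
  assert(!canEmitMustTail(stack));
}
```

Defaulting the query to `false` means any new cleanup kind is treated as blocking until someone explicitly marks it as skippable, which errs on the side of rejecting.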

CC'ing Varun Gandhi.

Is musttail actually supported generically on all LLVM backends, or does this need a target restriction?

You should structure this code so it's easy to add exceptions for certain calling conventions that can support tail calls with weaker restrictions (principally, callee-pop conventions). Mostly that probably means checking the calling convention first, or extracting the type restriction checks into a different function that you can skip. For example, I believe x86's fastcall convention can logically support any combination of prototypes as musttail as long as the return types are vaguely compatible.

clang/lib/CodeGen/CGCall.cpp
5318–5319	Yes, I think ErrorUnsupported is a much better idea.
clang/lib/Sema/SemaStmt.cpp
644	Blocks ought to be extremely straightforward to support. Just validate that the tail call is to a block pointer and then check that the underlying function types line up in the same way. You will need to be able to verify that there isn't a non-trivial conversion on the return types, even if the return type isn't known at this point in the function, but that's a problem in C++ as well due to lambdas and `auto` deduced return types. Also, you can use `isa<...>` for checks like this instead of `dyn_cast<...>`.

In D99517#2667025, @rjmccall wrote:

You should structure this code so it's easy to add exceptions for certain calling conventions that can support tail calls with weaker restrictions (principally, callee-pop conventions). Mostly that probably means checking the calling convention first, or extracting the type restriction checks into a different function that you can skip. For example, I believe x86's fastcall convention can logically support any combination of prototypes as musttail as long as the return types are vaguely compatible.

The LLVM musttail flag doesn't seem to allow for any target-specific loosening of the rules at the moment, so I don't think we can get any benefit from such restructuring right now; do you think it's OK to defer this restructuring and use the stricter rules across all targets for now?

I think there is also value in having a target-independent set of restrictions, even if we could actually guarantee tail calls in more circumstances on some (or maybe most!) targets, in order to allow people to make portable use of the attribute and as data towards something that we might be able to standardize. (For example, the people working on coroutines in C++ wanted something like this, but wanted feedback from implementers on what set of restrictions would be necessary in order to portably guarantee a tail call.) In order to strike a balance between portability and usefulness here, maybe we could plan to eventually accept any musttail call we know the target can support, but warn on musttail calls that don't satisfy the stricter rules and therefore may be non-portable?

In D99517#2667088, @rsmith wrote:

In D99517#2667025, @rjmccall wrote:

You should structure this code so it's easy to add exceptions for certain calling conventions that can support tail calls with weaker restrictions (principally, callee-pop conventions). Mostly that probably means checking the calling convention first, or extracting the type restriction checks into a different function that you can skip. For example, I believe x86's fastcall convention can logically support any combination of prototypes as musttail as long as the return types are vaguely compatible.

The LLVM musttail flag doesn't seem to allow for any target-specific loosening of the rules at the moment, so I don't think we can get any benefit from such restructuring right now; do you think it's OK to defer this restructuring and use the stricter rules across all targets for now?

Right, I wasn't suggesting that we needed to implement weaker rules right now, just that it'd be nice if the code didn't have to be totally restructured just to do it. Right now it's one big function that does all the checks.

I think there is also value in having a target-independent set of restrictions, even if we could actually guarantee tail calls in more circumstances on some (or maybe most!) targets, in order to allow people to make portable use of the attribute and as data towards something that we might be able to standardize. (For example, the people working on coroutines in C++ wanted something like this, but wanted feedback from implementers on what set of restrictions would be necessary in order to portably guarantee a tail call.) In order to strike a balance between portability and usefulness here, maybe we could plan to eventually accept any musttail call we know the target can support, but warn on musttail calls that don't satisfy the stricter rules and therefore may be non-portable?

I agree that we should not start loosening restrictions based on the vagaries of the platform CC, e.g. recognizing that a particular set of arguments happens to be passed solely in registers. I was thinking about callee-pop CCs, like fastcall and swiftasynccall, which are generally designed from the start to support almost unrestricted tail calls; e.g. the only restriction on tail calls between fastcall functions is that the return types are compatible. (IIRC — it's possible that highly-aligned arguments would change that.) Since tail calls are part of the designed feature set of these conventions, it seems appropriate to think about them when adding a tail-call feature.

Standard C conventions generally don't support unrestricted tail calls (because of variadics, unprototyped calls, easier assembly-writing, and history), so this would only apply as a target-specific extension when used in conjunction with a non-standard CC, which meshes well with your goals for standardization. I just want you to write the code so that maintainers can more easily skip some of the restrictions to cover a non-standard CC.

I'm not surprised that the C++ coroutine people want unrestricted tail calls; this is all pretty predictable, and it's essentially the point I made about generic coroutine lowering several years ago at LLVM dev. Really, they need to be asking for a standard calling convention that guarantees unrestricted tail calls. Of course, that would require the standard to admit the existence of calling conventions (other than language linkage :)).

aaron.ballman added inline comments. Apr 3 2021, 9:36 AM

clang/include/clang/Basic/AttrDocs.td
458	It'd be nice if we could nail down "similar" somewhat. I don't know if `int` and `short` are similar (due to promotions) or `const int` and `int` are similar, etc.
461–462	Not only is this not usable with K&R C declarations, but it's also not usable with `...` variadic functions either, right?
clang/lib/Sema/SemaStmt.cpp
562–569	I disagree that `ActOnAttributedStmt()` is the correct place for this checking -- template checking should occur when the template is instantiated, same as happens for declaration attributes. I'd like to see this functionality moved to SemaStmtAttr.cpp. Keeping the attribute logic together and following the same patterns is what allows us to tablegenerate more of the attribute logic. Statement attributes are just starting to get more such automation.

Addressed comments and tried moving check to SemaStmtAttr.cpp.

clang/lib/Sema/SemaStmt.cpp
562–569	I tried commenting out this code and adding the following code into `handleMustTailAttr()` in `SemaStmtAttr.cpp`:
	    if (!S.checkMustTailAttr(St, MTA)) return nullptr;
	This caused my test cases related to templates to fail. It also seemed to break test cases related to `JumpDiagnostics`. My interpretation of this is that `handleMustTailAttr()` is called during parsing only, and cannot catch errors at template instantiation time or that require a more complete AST. What am I missing? Where in SemaStmtAttr.cpp are you suggesting that I put this check?
601	`IgnoreImplicitAsWritten()` doesn't skip `ExprWithCleanups`, and per your previous comment I was trying to find a `CallExpr` before doing the check prohibiting `ExprWithCleanups` with side effects. I could write some custom ignore logic using `clang::IgnoreExprNodes()` directly.
	> If we're going to skip elidable copy construction of the result here (which I think we should)
	To clarify, are you suggesting that we allow `musttail` through elidable copy constructors on the return value, even if `-fno-elide-constructors` is set? ie. we consider that `musttail` overrides the `-fno-elide-constructors` option on the command line?

Added missing S.setFunctionHasMustTail().

haberman added inline comments. Apr 3 2021, 1:41 PM

clang/lib/Sema/SemaStmt.cpp
562–569	Scratch the part about `JumpDiagnostics`, that was me failing to call `S.setFunctionHasMustTail()`. I added that and now the `JumpDiagnostics` tests pass. But the template test cases still fail, and I can't find any hook point in `SemaStmtAttr.cpp` that will let me evaluate these checks at template instantiation time.
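
A small sketch of why these checks may need to run at instantiation time: in a template, the callee's signature can depend on a template parameter. The `MUSTTAIL` macro, `apply`, `Doubler`, and `Negator` are hypothetical names for illustration; the macro expands to nothing where the attribute is unavailable.

```cpp
#include <cassert>

// Hypothetical helper macro: use the attribute only where it is available.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

struct Doubler {
  static int step(int x) { return x * 2; }
};

struct Negator {
  static int step(int x) { return -x; }
};

// Whether the call below satisfies the musttail constraints depends on the
// signature of Policy::step, which is only known once apply<Policy> is
// instantiated. At parse time the callee is still a dependent name.
template <typename Policy>
int apply(int x) {
  MUSTTAIL return Policy::step(x);
}

void example_usage() {
  assert(apply<Doubler>(21) == 42);
  assert(apply<Negator>(5) == -5);
}
```

Both policies here match the caller's `int(int)` signature; a policy whose `step` took different parameters would be the kind of case that can only be diagnosed per instantiation.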

Harbormaster completed remote builds in B97009: Diff 335103. Apr 3 2021, 2:02 PM

Harbormaster completed remote builds in B97012: Diff 335106. Apr 3 2021, 2:44 PM

aaron.ballman added inline comments. Apr 4 2021, 6:30 AM

clang/lib/Sema/SemaStmt.cpp
562–569	I think there's a bit of an architectural mixup, but I'm curious if @rsmith agrees before anyone starts doing work to make changes. When transforming declarations, `RebuildWhatever()` calls the `ActOnWhatever()` function which calls `ProcessDeclAttributeList()` so that attributes are processed. `RebuildAttributedStmt()` similarly calls `ActOnAttributedStmt()`. However, `ActOnAttributedStmt()` doesn't call `ProcessStmtAttributes()` -- the logic is reversed so that `ProcessStmtAttributes()` is what calls `ActOnAttributedStmt()`. I think the correct answer is to switch the logic so that `ActOnAttributedStmt()` calls `ProcessStmtAttributes()`, then the template logic should automatically work.

haberman added inline comments. Apr 4 2021, 10:31 AM

clang/lib/Sema/SemaStmt.cpp
562–569	> I think the correct answer is to switch the logic so that ActOnAttributedStmt() calls ProcessStmtAttributes()
	I think this would require `ProcessStmtAttributes()` to be split into two separate functions. Currently that function is doing two separate things:
	1. Translation of `ParsedAttr` into various subclasses of `Attr`.
	2. Validation that the attribute is semantically valid.
	The function signature for `ActOnAttributedStmt()` uses `Attr` (not `ParsedAttr`), so (1) must happen during the parse, before `ActOnAttributedStmt()` is called. But (2) must be deferred until template instantiation time for some cases, like `musttail`.

aaron.ballman added inline comments. Apr 5 2021, 7:30 AM

clang/lib/Sema/SemaStmt.cpp
562–569	I don't think the signature for `ActOnAttributedStmt()` is correct to use `Attr` instead of `ParsedAttr`. I think it should be `StmtResult ActOnAttributedStmt(const ParsedAttributesViewWithRange &AttrList, Stmt *SubStmt);` -- this likely requires a fair bit of surgery to make work though, which is why I'd like to hear from @rsmith if he agrees with the approach. In the meantime, I'll play around with this idea locally in more depth.

aaron.ballman added inline comments. Apr 5 2021, 12:15 PM

clang/lib/Sema/SemaStmt.cpp
562–569	I think my suggestion wasn't quite right, but close. I've got a patch in progress that changes this the way I was thinking it should be changed, but it won't call `ActOnAttributedStmt()` when doing template instantiation. Instead, it will continue to instantiate attributes explicitly by calling `TransformAttr()` and any additional instantiation time checks will require you to add a `TreeTransform::TransformWhateverAttr()` to do the actual instantiation work (which is similar to how the declaration attributes work in `Sema::InstantiateAttrs()`). I hope to put up a patch for review for these changes today or tomorrow. It'd be interesting to know whether they make your life easier or harder though, if you don't mind taking a look and seeing how well (or poorly) they integrate with your changes here.

aaron.ballman added inline comments. Apr 5 2021, 1:08 PM

clang/lib/Sema/SemaStmt.cpp
562–569	You can find that review at https://reviews.llvm.org/D99896.

rsmith added inline comments. Apr 5 2021, 1:10 PM

clang/lib/Sema/SemaStmt.cpp
562–569	I think the ideal model would be that we form a `FooAttr` from the user-supplied attribute description in an `ActOn` function from the parser, and have a separate template instantiation mechanism to instantiate `FooAttr` objects, and those methods are unaware of the subject of the attribute. Then we have a separate mechanism to attach an attribute to its subjects that is used by both parsing and template instantiation. But I suspect there are reasons that doesn't work in practice -- where we need to know something about the subject in order to know how to form the `FooAttr`. That being the case, it probably makes most sense to model the formation and application of a `FooAttr` as a single process.
	> it won't call `ActOnAttributedStmt()` when doing template instantiation
	Good -- not calling `ActOn` during template instantiation is the right choice in general -- the `ActOn` functions are only supposed to be called from parsing, with a `Build` added if the parsing and template instantiation paths would share code (we sometimes shortcut that when the `ActOn` and `Build` would be identical, but I think that's turned out to be a mistake).
	> any additional instantiation time checks will require you to add a `TreeTransform::TransformWhateverAttr()` to do the actual instantiation work
	That sounds appropriate to me in general. Are you expecting that this function would also be given the (transformed and perhaps original) subject of the attribute?
601	> `IgnoreImplicitAsWritten()` doesn't skip `ExprWithCleanups`
	That sounds like a bug. Are you sure? It looks like `IgnoreImplicitAsWrittenSingleStep` calls `IgnoreImplicitSingleStep` which calls `IgnoreImplicitCastsSingleStep` which skips `FullExpr`, and `ExprWithCleanups` is a kind of `FullExpr`.
	> To clarify, are you suggesting that we allow `musttail` through elidable copy constructors on the return value, even if `-fno-elide-constructors` is set? ie. we consider that `musttail` overrides the `-fno-elide-constructors` option on the command line?
	Yes, I think the `musttail` attribute should override `-fno-elide-constructors`, because that's necessary in order to provide the tail call the user requested (and the local setting should override the global one). This is probably worth adding to the documentation. (Also, `-fno-elide-constructors` is only supposed to affect code generation, not language semantics or program validity, so I think either we should always reject if a constructor call is required for the return value, regardless of whether it's elidable, or we should never reject in that case, and either way this determination should be made independent of the setting of `-fno-elide-constructors`. Given that choice, it seems more useful to bias towards the common case (`-felide-constructors`).)

haberman added inline comments. Apr 5 2021, 3:29 PM

clang/lib/Sema/SemaStmt.cpp
562–569	Would it be possible to defer that refactoring until after this change is in? There are a lot of other issues to resolve on this review as it is, and throwing a potential refactoring into the mix is making it a lot harder to get this into a state where it can be landed. Once it's in I'm happy to collaborate on the other review.

aaron.ballman added inline comments. Apr 6 2021, 4:26 AM

clang/lib/Sema/SemaStmt.cpp
562–569	I'm fine with that -- my suggestion would be to ignore the template instantiation validation for the moment (add tests with FIXME comments where the behavior isn't what you want) and then when I get you the functionality you need to have more unified checking, you can refactor it at that time.

Returned validation to ActOnAttributedStmt() so it works with templates.
Merge branch 'main' into musttail
Address more review comments.

Herald added a reviewer: jdoerfert. Apr 7 2021, 10:42 PM

Herald added a subscriber: sstefan1.

haberman added inline comments. Apr 7 2021, 10:45 PM

clang/include/clang/Basic/AttrDocs.td
458	Done. I tried to summarize the C++ concept of "similar" types as defined in https://eel.is/c++draft/conv.qual#2 and implemented in https://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#a1b1b3b7a67a30fd817ba85454780d8ad
clang/lib/Sema/SemaStmt.cpp
562–569	I would strongly prefer to submit correct code (that validates templates) and leave a FIXME to make it pretty, rather than submit pretty code and leave a FIXME to make it correct.
609	I think this case will work actually, the callee decl in this case is just the function pointer, which seems appropriate and type checks correctly. I added a test for this.
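
For the function-pointer case, here is a sketch of a tiny dispatch-style interpreter in which every handler tail-calls the next one through a pointer. Everything in it (`MUSTTAIL`, `Machine`, `Handler`, the opcode table, `run`) is a hypothetical illustration, and the macro expands to nothing where the attribute is unavailable.

```cpp
#include <cassert>

// Hypothetical helper macro: use the attribute only where it is available.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

struct Machine {
  const unsigned char *pc;  // next opcode to execute
  long acc;                 // accumulator
};

using Handler = long (*)(Machine *);

long op_halt(Machine *m);
long op_inc(Machine *m);
long op_double(Machine *m);

// Hypothetical opcode table: 0 = halt, 1 = increment, 2 = double.
static const Handler kTable[] = {op_halt, op_inc, op_double};

// The callee is only known through a function pointer, but its type matches
// the caller's signature, which is what the tail call needs.
long dispatch(Machine *m) {
  Handler next = kTable[*m->pc++];
  MUSTTAIL return next(m);
}

long op_halt(Machine *m) { return m->acc; }

long op_inc(Machine *m) {
  m->acc += 1;
  MUSTTAIL return dispatch(m);
}

long op_double(Machine *m) {
  m->acc *= 2;
  MUSTTAIL return dispatch(m);
}

// Ordinary (non-tail) entry point; the Machine outlives the whole run.
long run(const unsigned char *program) {
  Machine m{program, 0};
  return dispatch(&m);
}

void example_usage() {
  const unsigned char program[] = {1, 1, 2, 1, 0};  // ((0+1+1)*2)+1
  assert(run(program) == 5);
}
```

Note that the address of the local `Machine` is passed only by the ordinary call in `run`, never through a musttail call, since a tail call would end the local's lifetime before the callee could use it.
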
644	Tail calls to a block are indeed straightforward and are handled below. This check is for tail calls from a block, which I tried to add support for but didn't have much luck (in particular, during parsing of a block I wasn't able to get good type information for the block).
	> I'd probably use a specific diagnostic ("cannot be used from a block" / "cannot be used from an Objective-C function") for the block and ObjCMethod case, and a nonspecific-but-correct "cannot be used from this context" for anything else.
	I implemented this as requested. I wasn't able to test OpenMP as you apparently can't return from an OpenMP block.

Harbormaster completed remote builds in B97655: Diff 336004. Apr 7 2021, 11:36 PM

aaron.ballman added inline comments. Apr 8 2021, 7:29 AM

clang/lib/Sema/SemaStmt.cpp
562–569	I'm okay with that so long as the follow-up work actually happens (not to suggest that you plan to ignore the request!). "This is functional but not pretty" has a risk of becoming enshrined behavior as priorities shift, whereas "this is incomplete" generally does not. Please add a FIXME comment here just to make sure it's clear we want the code to move in the future.

Added FIXME for attribute refactoring.

Factored duplicated code into a method on MustTailAttr.

haberman added inline comments. Apr 8 2021, 9:15 AM

clang/lib/Sema/SemaStmt.cpp
562–569	I added a FIXME. Just to set expectations, I'm happy to work with you on updating this code to fit your planned refactoring (either by offering comments/suggestions on a review by you or creating my own follow-up review per your suggestions). But I'll need a fair amount of input from you, since I don't fully grok what you find objectionable about the current code or what your desired end state is.

aaron.ballman added inline comments. Apr 8 2021, 9:34 AM

clang/lib/Sema/SemaStmt.cpp
562–569	Thanks for the FIXME. I'm totally happy to iterate with you on the refactoring. Mostly, it involves testing whether https://reviews.llvm.org/D99983 provides you with enough contextual information when performing template instantiation for you to be able to put the attribute checking logic into the right places. The objectionable bit about the current approach is that `ActOnAttributedStmt()`/`BuildAttributedStmt()` are general functions for attributed statements that should not be doing per-attribute diagnostic work (this won't scale well as more statement attributes get added). My preferred approach based on what you have already is to call `checkMustTailAttr()` from `handleMustTailAttr()`, and call it from `TreeTransform.h` in a new `TransformMustTailAttr()` function when doing template instantiation (this part is what requires the other patch to land first).

Moved calling convention check to happen as early as possible.

In D99517#2667025, @rjmccall wrote:

You should structure this code so it's easy to add exceptions for certain calling conventions that can support tail calls with weaker restrictions (principally, callee-pop conventions). Mostly that probably means checking the calling convention first, or extracting the type restriction checks into a different function that you can skip. For example, I believe x86's fastcall convention can logically support any combination of prototypes as musttail as long as the return types are vaguely compatible.

I moved the calling convention check to be as early as possible.

Harbormaster completed remote builds in B97751: Diff 336130.Apr 8 2021, 10:14 AM

Harbormaster completed remote builds in B97761: Diff 336141.Apr 8 2021, 10:52 AM

Harbormaster completed remote builds in B97769: Diff 336153.Apr 8 2021, 11:20 AM

Formatted files with clang-format.

haberman marked 2 inline comments as done.Apr 8 2021, 1:23 PM

haberman added inline comments.

clang/lib/Sema/SemaStmt.cpp
562–569	Sounds good. I will follow up with you on https://reviews.llvm.org/D99983.

haberman marked an inline comment as done.Apr 8 2021, 1:24 PM

haberman mentioned this in D100138: Debug Go attribute test.Apr 8 2021, 2:03 PM

Harbormaster completed remote builds in B97802: Diff 336203.Apr 8 2021, 2:03 PM

rsmith added inline comments.Apr 8 2021, 4:11 PM

clang/include/clang/Basic/DiagnosticSemaKinds.td
2828–2829	Can we somehow avoid talking about ARC where it's not relevant? While it'd be nice to be more precise here, my main concern is that we shouldn't be mentioning ARC to people for whom it's not a meaningful term (eg, when not compiling Objective-C or Objective-C++). Perhaps the simplest approach would be to only mention ARC if `getLangOpts().ObjCAutoRefCount` is set?
clang/lib/AST/AttrImpl.cpp
221–226 ↗	(On Diff #336203)	`IgnoreImplicitAsWritten` should already skip over implicit elidable constructors, so I would imagine this is skipping over elidable explicit constructor calls (eg, `[[musttail]] return T(make());` would perform a tail-call to `make()`). Is that what we want?
clang/lib/CodeGen/CGStmt.cpp
668	In the case where we're forcibly eliding a constructor, we'll need to emit a return statement that returns `musttail` call expression here rather than emitting the original substatement. Otherwise the tail call we emit will be initializing a local temporary rather than initializing our return slot. Eg, given:

struct A { A(const A&); ~A(); char data[32]; };
A f();
A g() { [[clang::musttail]] return f(); }

under `-fno-elide-constructors` when targeting C++11, say, we'll normally lower that into something like:

void f(A return_slot);
void g(A return_slot) {
  A temporary; //uninitialized
  f(&temporary); // call f
  A::A(return_slot, temporary); // call copy constructor to copy into return slot
}

... and with the current patch, it looks like we'll add a 'ret void' after the call to `f`, leaving `g`'s return slot uninitialized and passing an address into `f` that refers to a variable that will no longer exist once `f` is called. We need to instead lower to:

void f(A return_slot);
void g(A return_slot) {
  f(return_slot); // call f
}

Probably the easiest way to do this would be to change the return value on the `ReturnStmt` to be the tail-called `CallExpr` when attaching the attribute.
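
For readers who want to try the return-slot case, here is a minimal compilable sketch. It makes a few assumptions: `MUSTTAIL` is a hypothetical helper macro that expands to the attribute only on a Clang that reports support for it, `make` and `forward` are illustrative names, and the destructor from the example above is left out so that `A` stays trivially destructible.

```cpp
#include <cstring>

// Hypothetical helper macro: expands to the attribute only where Clang
// reports support for it, and to nothing elsewhere.
#if defined(__clang__) && defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

// The user-provided copy constructor means A is returned through a hidden
// return slot. No destructor is declared, so A is trivially destructible.
struct A {
  A() { std::memset(data, 0, sizeof data); }
  A(const A &other) { std::memcpy(data, other.data, sizeof data); }
  char data[32];
};

A make() {
  A a;
  a.data[0] = 'x';
  return a;
}

// With the attribute active, the call to make() is expected to initialize
// forward()'s own return slot instead of a local temporary.
A forward() {
  MUSTTAIL return make();
}
```

Calling `forward()` should yield an object whose first byte is `'x'`, whether or not the attribute is active. The attribute only changes how the call is lowered.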

Refined the implicit constructor skipping code.

clang/include/clang/Basic/DiagnosticSemaKinds.td
2828–2829	I implemented this but I couldn't figure out how to actually trigger the ARC case, so I just removed that part of the diagnostic text for now.
clang/lib/AST/AttrImpl.cpp
221–226 ↗	(On Diff #336203)	As discussed offline, it appears that `IgnoreImplicitAsWritten()` was not skipping the implicit constructor in this case. Per our discussion, I created a new version of `IgnoreImplicitAsWritten()` that does, with a FIXME to land it in `Expr`, and I made it skip implicit constructors only (and added tests for this case).
clang/lib/CodeGen/CGStmt.cpp
668	Done. I had to change your test case to remove the destructor, otherwise it fails the trivial destruction check. Take a look at the CodeGen tests and see if the output looks correct to you.

Harbormaster completed remote builds in B97878: Diff 336310.Apr 8 2021, 9:40 PM

Rename and refine IgnoreElidableImplicitConstructorSingleStep().

Harbormaster completed remote builds in B97881: Diff 336316.Apr 8 2021, 11:40 PM

Mostly just nits from me, but the attribute portions look good to me.

clang/include/clang/AST/IgnoreExpr.h
127
clang/lib/Sema/SemaStmt.cpp
628
636–637	This worries me slightly -- not all `CallExpr` objects have a callee declaration (https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/Expr.cpp#L1367). That said, I'm struggling to come up with an example that isn't covered so this may be fine.
641
655
659
682
700	It'd be better not to go through the cast machinery twice -- you cast to the `MemberPointerType` and then cast to the same thing again (but in a different way).
clang/lib/Sema/SemaStmtAttr.cpp
214	This can be removed entirely.

Simplified some casts and type declarations.

clang/lib/Sema/SemaStmt.cpp
636–637	That was my experience too, I wasn't able to find a case that isn't covered. I tried to avoid adding any diagnostics that I didn't know how to trigger or test.
700	I changed to `auto`, but I can't tell if you have another suggestion here also. I can't see how any of these casts can be removed.

Harbormaster completed remote builds in B98031: Diff 336511.Apr 9 2021, 11:24 AM

aaron.ballman added inline comments.Apr 9 2021, 12:12 PM

clang/lib/Sema/SemaStmt.cpp
697–699	I'm not certain if I should take a shower after writing that code or not, but it's one potential way not to perform the cast twice. If that code is too odious for others, we should at least change the `dyn_cast<>` in the `else if` to be an `isa<>`.

Switch to isa<> for type check.
Merge branch 'main' into musttail

haberman marked an inline comment as done.Apr 12 2021, 10:42 AM

haberman added inline comments.

clang/lib/Sema/SemaStmt.cpp
697–699	I changed `dyn_cast<>` to `isa<>`. If @rsmith concurs about the `dyn_cast_or_null<>` variant I'll switch to that.

Harbormaster completed remote builds in B98311: Diff 336894.Apr 12 2021, 12:22 PM

rsmith added inline comments.Apr 12 2021, 4:17 PM

clang/lib/Sema/SemaStmt.cpp
603–604	I think this would be clearer, assuming it's equivalent (and if it's not equivalent, I think it'd be useful to include a comment explaining why).
605–609	This loop is problematic: it's generally not safe to modify an expression that is used as a subexpression of another expression. (Modifying the `ReturnStmt` is, by contrast, much less problematic because the properties of a statement have less complex dependencies on the properties of its subexpressions.) In particular, if there were any implicit conversions here that changed the type or value category or similar, the enclosing parentheses would have the wrong type / value category / similar. Also there are possibilities here other than `CallExpr` and `ParenExpr`, such as anything else that we consider to be "parentheses" (such as a `GenericSelectionExpr`). But I think this loop should never be necessary, because all implicit conversions should always be on the outside of the parentheses. Do you have a testcase that needs it?
618	... would be more in line with our normal idioms.
636–637	This assert is incorrect. It would fail for a case like: using T = int(); T *f(); int g() { [[clang::musttail]] return f()(); } ... where there is no declaration associated with the function pointer returned by `f()`. I think instead of looking for a callee declaration, you should instead inspect the callee expression. You can distinguish between a member function call and a non-member call by looking at the type of the callee. Perhaps the simplest way would be to distinguish between three cases: (1) There is a callee declaration, which is a member function: this is a direct call to a member function; you can use the type of the callee declaration for your check. (2) The callee expression is (after skipping parens) a pointer-to-member access operator (`BinaryOperator::isPtrMemOp`); you can use the type of the RHS operand (which will be a pointer to member function) for your check. (3) Anything else: this is a non-member-function call, and you can directly inspect the type of the callee without caring about the callee declaration. (You might still find the type is not a function type at this stage, which indicates this is some kind of special form. In particular, it could be a `BuiltinType::BoundMember` for a pseudo-destructor call. I'm not sure if there are currently any other special cases that make it this far; there might not be, because most such cases are dependent.)
687	Use `getAs` rather than `dyn_cast` to look through type sugar. For example, in void (f)() { [[clang::musttail]] return f(); } ... the type of `f` is a `ParenType`, not a `FunctionProtoType`.
697	You need to use `getAs<MemberPointerType>` here not `isa` in order to look through type sugar (eg, typedefs). However, as noted above, a call via a member pointer doesn't necessarily have a `CalleeDecl`, so you'll need to do this check by looking for a callee expression that's the right kind of `BinaryOperator` instead.
clang/test/CodeGen/attr-musttail.cpp
1 ↗	(On Diff #336894)	This is a C++ test so it should be in `CodeGenCXX`.
178–181 ↗	(On Diff #336894)	It turns out that we consider `p` to be the callee decl in this case, so we'll need a better example :)
194 ↗	(On Diff #336894)	This doesn't include enough of the output to be able to tell if we've generated correct code. Can you also include the `define ...` line, showing that `%agg.result` is the name of the first parameter?
clang/test/Sema/attr-musttail.cpp
1 ↗	(On Diff #336894)	This should be in `SemaCXX`.
66 ↗	(On Diff #336894)	Please add a FIXME to this; it seems like a bug that we can't tell the difference between needing to run a destructor for the return value and needing to run a destructor for some other temporary created in the return statement.
78 ↗	(On Diff #336894)	The "is a member of different class (expected `void`" seems surprising here. Can we customize the diagnostic to instead say that we can't `musttail` from a non-member to a member (and vice versa for the other case)?
167–171 ↗	(On Diff #336894)	Please also test the pseudo-destructor case: void f() { int n; using T = int; [[clang::musttail]] return n.~T(); }
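
The three callee shapes distinguished in the comment on lines 636–637 can be sketched in a self-contained file. This is an illustration under assumptions, not the patch's test code: `MUSTTAIL` is a hypothetical helper macro, and `S`, `pick`, `twice` and the other names are invented for the example.

```cpp
// Hypothetical helper macro: expands to the attribute only where Clang
// reports support for it, and to nothing elsewhere.
#if defined(__clang__) && defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

struct S {
  int base = 0;
  int add(int x) { return base + x; }

  // (1) Direct call to a member function: a callee declaration exists.
  int direct(int x) { MUSTTAIL return add(x); }

  // (2) Call through a pointer to member: the callee expression is a
  // pointer-to-member access operator.
  int via_pmf(int x) {
    int (S::*pmf)(int) = &S::add;
    MUSTTAIL return (this->*pmf)(x);
  }
};

using Fn = int(int);
int twice(int x) { return 2 * x; }
Fn *pick() { return twice; }

// (3) Anything else: the callee is the function pointer returned by pick(),
// so only the type of the callee expression can be inspected.
int via_returned_pointer(int x) { MUSTTAIL return pick()(x); }
```

In each case the caller and the target have the same parameter list and return type, which is what the signature check compares.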

Addressed more review comments.

clang/lib/Sema/SemaStmt.cpp
605–609	I removed it and my test cases still pass. I'm glad to know this isn't necessary: I was coding defensively because I didn't know that I could count on this invariant: all implicit conversions should always be on the outside of the parentheses.

Functionally this looks good to me. I've suggested some minor cleanups and I understand you're doing some wordsmithing on the diagnostics; I think once those are complete this will be ready to land. Thank you!

clang/lib/CodeGen/CGExpr.cpp
4829	I agree, that sounds like a nice cleanup. Delaying this to a future change makes sense to me.
clang/lib/Sema/SemaStmt.cpp
616	You shouldn't need the `const` in the argument to `cast`, and we generally omit it; `cast` copies the pointer/referenceness and qualifiers from its argument anyway, and the explicit `const` in the type of `R` seems sufficient for readers. (I'm not even sure if `cast` intends to permit explicit qualifiers here.)
664	I think this `isa<CapturedDecl>` check is redundant, because a `CapturedDecl` is not a `FunctionDecl`, so `CallerDecl` will always be null when `CurContext` is a `CapturedDecl`.
711	Even in invalid code we should never see a `CallExpr` whose callee has a null type; if `Sema` can't form an `Expr` that meets the normal expression invariants during error recovery, it doesn't build one at all. I think you can remove this `if`.
771	Given that we don't care about differences in qualifiers, it might be clearer to not include them in the diagnostics.
clang/test/CodeGenCXX/attr-musttail.cpp
213

Nice new feature! Please also update Release Notes for clang.

haberman added inline comments.Apr 13 2021, 4:20 PM

clang/lib/Sema/SemaStmt.cpp

711

Without this if(), I crash on this test case. What do you think?

struct TestBadPMF {
  int (TestBadPMF::*pmf)();
  void BadPMF() {
    [[clang::musttail]] return ((*this)->*pmf)(); // expected-error {{left hand operand to ->* must be a pointer to class compatible with the right hand operand, but is 'TestBadPMF'}}
  }
};

Dump of CalleeExpr is:

RecoveryExpr 0x106671e8 '<dependent type>' contains-errors lvalue
|-ParenExpr 0x10667020 'struct TestBadPMF' lvalue
| `-UnaryOperator 0x10667008 'struct TestBadPMF' lvalue prefix '*' cannot overflow
|   `-CXXThisExpr 0x10666ff8 'struct TestBadPMF *' this
`-MemberExpr 0x10667050 'int (struct TestBadPMF::*)(void)' lvalue ->pmf 0x10666ed0
  `-CXXThisExpr 0x10667040 'struct TestBadPMF *' implicit this

rsmith added inline comments.Apr 13 2021, 4:56 PM

clang/lib/Sema/SemaStmt.cpp
711	Ah, right, while the callee will always have a non-null type, that type might not be a pointer type. I think what we're missing here is a check for a dependent callee; checking for a dependent context isn't enough to check for error-dependent constructs. Probably the simplest thing would be to change the `isDependentContext()` checks to also check if the return expression `isInstantiationDependent()`. (That would only help with the error-dependent cases for now, but we'd also need that extra check in the future if anything like http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2277r0.html goes forward, allowing dependent constructs in non-dependent contexts, especially in combination with http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1306r1.pdf.)

Harbormaster completed remote builds in B98552: Diff 337252.Apr 13 2021, 7:36 PM

Word-smithed diagnostics and addressed other review comments.

More diagnostic wordsmithing.

Harbormaster completed remote builds in B98779: Diff 337576.Apr 14 2021, 4:59 PM

Harbormaster completed remote builds in B98781: Diff 337581.Apr 14 2021, 5:30 PM

Added release note for [[clang::musttail]].

Fixed release note escaping.

Fixed several cases in CodeGen test.

Thanks, cool :)

Fixed typo in comment.

Ok I think this is ready to land.

There are a few FIXME comments, I will follow up with some small changes to address them.

Harbormaster completed remote builds in B98787: Diff 337589.Apr 14 2021, 6:31 PM

Harbormaster completed remote builds in B98790: Diff 337592.Apr 14 2021, 6:54 PM

Harbormaster completed remote builds in B98788: Diff 337590.

Harbormaster completed remote builds in B98795: Diff 337597.Apr 14 2021, 7:20 PM

rsmith accepted this revision.Apr 15 2021, 4:47 PM

This revision is now accepted and ready to land.Apr 15 2021, 4:47 PM

This revision was landed with ongoing or failed builds.Apr 15 2021, 5:13 PM

Closed by commit rG834467590842: Implemented [[clang::musttail]] attribute for guaranteed tail calls. (authored by haberman, committed by rsmith).

This revision was automatically updated to reflect the committed changes.

rsmith added a commit: rG834467590842: Implemented [[clang::musttail]] attribute for guaranteed tail calls..

Looks like this breaks tests on mac/arm: http://45.33.8.238/macm1/7552/step_7.txt

Please take a look and revert for now if it takes a while to fix.

In D99517#2693418, @thakis wrote:

Looks like this breaks tests on mac/arm: http://45.33.8.238/macm1/7552/step_7.txt

Should be fixed by rGf7c9de0de5804498085af973dc6bfc934a18f000.

That is a great feature, thank you. Compiling state machines and scheme programs to C is now much prettier.

The error message here is very confusing:

/home/theraven/snmalloc2/src/mem/../ds/../aal/../ds/defines.h:122:27: error: cannot perform a tail call to function 'error' because its signature is incompatible with the calling function
      [[clang::musttail]] return snmalloc::error(str);
                          ^
/home/theraven/snmalloc2/src/mem/../ds/../aal/../ds/defines.h:63:16: note: target function has different number of parameters (expected 2 but has 1)
  [[noreturn]] SNMALLOC_COLD void error(const char* const str);
               ^
/home/theraven/snmalloc2/src/mem/../ds/../aal/../ds/defines.h:21:25: note: expanded from macro 'SNMALLOC_COLD'
#  define SNMALLOC_COLD __attribute__((cold))
                        ^
/home/theraven/snmalloc2/src/mem/../ds/../aal/../ds/defines.h:122:9: note: tail call required by 'musttail' attribute here
      [[clang::musttail]] return snmalloc::error(str);
        ^

The caller and callee both have one argument; the error arises because the enclosing function has two parameters. The error appears wrong anyway for two reasons in this particular context:

The callee is [[noreturn]], so the stack layout doesn't make any difference, anything can be tail called if it's no-return.
The enclosing function is always_inline, so checking its argument-frame layout does not give useful information because it's the caller's argument-frame layout that matters.

@theraven: Can you post a minimal repro of your case? I don't follow your distinction between "caller" and "enclosing function."

Regarding noreturn and always_inline: maybe the rules for musttail could be relaxed in cases like the one you mention, but it would require changing the backend (LLVM). Here I changed the front-end only and used LLVM's existing musttail support, which meant accepting its existing limitations.

I would love to see an exception for always_inline: my use case would benefit greatly from this. In my own project I had to change a bunch of always_inline functions to macros to work around this rule. Unfortunately this is complicated by the fact that always_inline does not actually guarantee that inlining occurs.
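
The macro workaround mentioned above might look like the following sketch. It is an assumption about the general shape, not code from the project in question: `MUSTTAIL` and `FORWARD_TO_TAIL` are hypothetical macros, and `tail`, `outer` and the two globals are illustrative names.

```cpp
// Hypothetical helper macro: expands to the attribute only where Clang
// reports support for it, and to nothing elsewhere.
#if defined(__clang__) && defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

static int last_tag = 0;
static float last_value = 0.0f;

void tail(int tag, float value) {
  last_tag = tag;
  last_value = value;
}

// Hypothetical macro standing in for an always_inline helper. It expands
// textually inside the enclosing function, so the musttail check compares
// tail() against that function's signature, not a helper's.
#define FORWARD_TO_TAIL(value) MUSTTAIL return tail(42, (value))

// outer(int, float) has the same parameter list as tail(int, float), so the
// expanded tail call satisfies the signature rule.
void outer(int x, float y) {
  (void)x;
  FORWARD_TO_TAIL(y);
}
```

The trade-off is the usual one for macros: the helper loses its own scope and type checking in exchange for being checked against the function that actually performs the tail call.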

Here's a minimal test:

void tail(int, float);

__attribute__((always_inline))
void caller(float x)
{
  [[clang::musttail]]
  return tail(42, x);
}

void outer(int x, float y)
{
        return caller(y);
}

This raises this error:

tail.cc:7:3: error: cannot perform a tail call to function 'tail' because its signature is incompatible with the calling function
  return tail(42, x);
  ^
tail.cc:1:1: note: target function has different number of parameters (expected 1 but has 2)
void tail(int, float);
^
tail.cc:6:5: note: tail call required by 'musttail' attribute here
  [[clang::musttail]]
    ^

There's also an interesting counterexample:

void tail(int, float);

__attribute__((always_inline))
void caller(int a, float x)
{
  [[clang::musttail]]
  return tail(a, x);
}

void outer(float y)
{
        return caller(42, y);
}

This *is* accepted by clang, but then generates this IR at -O0:

define dso_local void @_Z5outerf(float %0) #2 {
  %2 = alloca i32, align 4
  %3 = alloca float, align 4
  %4 = alloca float, align 4
  store float %0, float* %4, align 4
  %5 = load float, float* %4, align 4
  store i32 42, i32* %2, align 4
  store float %5, float* %3, align 4
  %6 = load i32, i32* %2, align 4
  %7 = load float, float* %3, align 4
  call void @_Z4tailif(i32 %6, float %7)
  ret void
}

And this IR at -O1:

; Function Attrs: uwtable mustprogress
define dso_local void @_Z5outerf(float %0) local_unnamed_addr #2 {
  call void @_Z4tailif(i32 42, float %0)
  ret void
}

Note that in both cases, the always-inline attribute is respected (even at -O0, the always-inline inliner runs) but the musttail annotation is lost. The inlining has inserted the call into a function with a different set of parameters and so it cannot have a musttail IR annotation.

mizvekov mentioned this in D105807: [X86] pr51000 in-register struct return tailcalling.Jul 12 2021, 8:11 AM

Please also see https://bugs.llvm.org/show_bug.cgi?id=51416

It's not generically true that "anything can be tail-called if it's noreturn". For one, noreturn doesn't imply that the function doesn't exit by e.g. throwing or calling longjmp. For another, the most important user expectation of tail calls is that a long series of tail calls will exhibit zero overall stack growth; in a caller-pop calling convention, calling a function with more parameters may require growing the argument area in a way that cannot be reversed, so e.g. a long sequence of tail calls alternating between 1-argument and 2-argument functions will eventually exhaust the stack, which violates that user expectation.
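
The zero-stack-growth expectation described above can be illustrated with a pair of mutually recursive functions that share one signature. This is a minimal sketch: `MUSTTAIL` is a hypothetical helper macro, and `is_even`/`is_odd` are illustrative names.

```cpp
// Hypothetical helper macro: expands to the attribute only where Clang
// reports support for it, and to nothing elsewhere.
#if defined(__clang__) && defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

bool is_odd(unsigned n);

// Both functions take one unsigned and return bool, so each tail call can
// reuse the caller's argument area without growing it.
bool is_even(unsigned n) {
  if (n == 0)
    return true;
  MUSTTAIL return is_odd(n - 1);
}

bool is_odd(unsigned n) {
  if (n == 0)
    return false;
  MUSTTAIL return is_even(n - 1);
}
```

With the attribute active, the recursion depth should not affect stack usage, even at -O0. Without it (for example, when the macro expands to nothing), deep inputs may exhaust the stack unless the optimizer happens to perform the tail call.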

chfast added a subscriber: chfast.Jan 9 2022, 4:59 AM

chfast added inline comments.

clang/lib/CodeGen/CGCall.cpp
5319	I reported a related issue. I wonder if this is easy to fix. https://github.com/llvm/llvm-project/issues/53087.

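Revision Contents

Path	Size
clang/include/clang/AST/IgnoreExpr.h	12 lines
clang/include/clang/Basic/Attr.td	6 lines
clang/include/clang/Basic/AttrDocs.td	26 lines
clang/include/clang/Basic/DiagnosticSemaKinds.td	47 lines
clang/include/clang/Sema/ScopeInfo.h	22 lines
clang/include/clang/Sema/Sema.h	13 lines
clang/lib/CodeGen/CGCall.cpp	24 lines
clang/lib/CodeGen/CGClass.cpp	2 lines
clang/lib/CodeGen/CGDecl.cpp	1 line
clang/lib/CodeGen/CGExpr.cpp	3 lines
clang/lib/CodeGen/CGExprCXX.cpp	6 lines
clang/lib/CodeGen/CGStmt.cpp	14 lines
clang/lib/CodeGen/CodeGenFunction.h	14 lines
clang/lib/CodeGen/EHScopeStack.h	2 lines
clang/lib/Sema/JumpDiagnostics.cpp	38 lines
clang/lib/Sema/Sema.cpp	5 lines
clang/lib/Sema/SemaStmt.cpp	271 lines
clang/lib/Sema/SemaStmtAttr.cpp	8 lines
clang/test/CodeGenCXX/attr-musttail.cpp	214 lines
clang/test/Sema/attr-musttail.c	15 lines
clang/test/Sema/attr-musttail.m	26 lines
clang/test/SemaCXX/attr-musttail.cpp	269 lines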

Commit	Tree	Parents	Author	Summary	Date
93f744085c1f	cfe9254bbfaf	cc285cd7bf69	Joshua Haberman	More diagnostic wordsmithing.	Apr 14 2021, 4:58 PM
cc285cd7bf69	887d88e1bc62	377a083c2a8f c609d5336344	Joshua Haberman	Merge branch 'main' into musttail	Apr 14 2021, 3:32 PM
377a083c2a8f	b8e24e07962b	cb1cf22ffd17	Joshua Haberman	Word-smithed diagnostics and addressed other review comments.	Apr 14 2021, 3:24 PM
cb1cf22ffd17	f0377f05acf9	f32f034983f4	Joshua Haberman	Addressed more review comments.	Apr 13 2021, 1:45 PM
f32f034983f4	b8d437862dcb	220ece60c3be 8508a63b887e	Joshua Haberman	Merge branch 'main' into musttail	Apr 12 2021, 10:33 AM
220ece60c3be	4a385af6bbd5	b4d319c5e665	Joshua Haberman	Switch to isa<> for type check.	Apr 12 2021, 10:32 AM
b4d319c5e665	0b9599fcb322	40aba23e1c27	Joshua Haberman	Simplified some casts and type declarations.	Apr 9 2021, 10:26 AM
40aba23e1c27	bb07b073a020	b7e703f664e9	Joshua Haberman	Rename and refine IgnoreElidableImplicitConstructorSingleStep().	Apr 8 2021, 10:32 PM
b7e703f664e9	430e20259c70	481a242e5a0e	Joshua Haberman	Refined the implicit constructor skipping code.	Apr 8 2021, 9:24 PM
481a242e5a0e	a40c19b9a9fe	a6bbaddb8704	Joshua Haberman	Formatted files with clang-format.	Apr 8 2021, 1:11 PM
a6bbaddb8704	442f6bb0a5cc	f824647c46cf	Joshua Haberman	Moved calling convention check to happen as early as possible.	Apr 8 2021, 9:43 AM
f824647c46cf	087d8ec0ff1f	617c62988c8b	Joshua Haberman	Factored duplicated code into a method on MustTailAttr.	Apr 8 2021, 9:06 AM
617c62988c8b	b6a0ac4ba31b	347990f04816	Joshua Haberman	Added FIXME for attribute refactoring.	Apr 8 2021, 8:39 AM
347990f04816	922ea348ea5b	942bfc6291f7	Joshua Haberman	Address more review comments.	Apr 7 2021, 10:41 PM
942bfc6291f7	cf0b74740332	c5cd4538dcfd 5c8462b5daa2	Joshua Haberman	Merge branch 'main' into musttail	Apr 7 2021, 11:23 AM
c5cd4538dcfd	eebb5e03f930	02d3160f0c23	Joshua Haberman	Returned validation to ActOnAttributedStmt() so it works with templates.	Apr 7 2021, 11:13 AM
02d3160f0c23	d3fd7f20ad16	93c4c67b230e	Joshua Haberman	Added missing S.setFunctionHasMustTail().	Apr 3 2021, 1:39 PM
93c4c67b230e	33a1d6c0d167	2d340aa54bec	Joshua Haberman	Addressed comments and tried moving check to SemaStmtAttr.cpp.	Apr 3 2021, 1:14 PM
2d340aa54bec	2b0983bbf3e7	053b84204c01	Joshua Haberman	Formatting fixes.	Apr 2 2021, 10:27 AM
053b84204c01	5a1342b2e510	25c76faef92d	Joshua Haberman	Fixed unit test by running `opt` in a separate invocation.	Apr 2 2021, 10:02 AM
25c76faef92d	653aa9de7818	ce932a437112	Joshua Haberman	Added Obj-C test.	Apr 1 2021, 9:13 PM
ce932a437112	f8f29a02192d	184930dc44b6	Joshua Haberman	Fixed a few bugs and fixed the tests.	Apr 1 2021, 9:12 PM
184930dc44b6	60061d2d440a	2d640d3e923f	Joshua Haberman	Reject constructors and destructors from musttail.	Apr 1 2021, 8:04 PM
2d640d3e923f	65b691346f5c	90964be326de	Joshua Haberman	Addressed more comments for musttail.	Apr 1 2021, 7:12 PM
90964be326de	f6ec6779de2d	beae38d525c8	Joshua Haberman	Expanded and refined the semantic checks for musttail, per CR feedback.	Mar 31 2021, 3:55 PM
beae38d525c8	d45d07365593	57ac0778e26a	Joshua Haberman	Updated formatting.	Mar 29 2021, 9:46 AM
57ac0778e26a	b515ddc76ed0	aaab44417969	Joshua Haberman	Implemented [[clang::musttail]] attribute for guaranteed tail calls. (Show More…)	Mar 28 2021, 10:14 PM

Diff 337581

clang/include/clang/AST/IgnoreExpr.h

(115 lines not shown)

  if (auto *MTE = dyn_cast<MaterializeTemporaryExpr>(E))
    return MTE->getSubExpr();
  if (auto *BTE = dyn_cast<CXXBindTemporaryExpr>(E))
    return BTE->getSubExpr();
  return E;
}

inline Expr *IgnoreElidableImplicitConstructorSingleStep(Expr *E) {
  auto *CCE = dyn_cast<CXXConstructExpr>(E);
  if (CCE && CCE->isElidable() && !isa<CXXTemporaryObjectExpr>(CCE)) {
    unsigned NumArgs = CCE->getNumArgs();

aaron.ballman (Done), suggested edit:
    if (CCE && CCE->isElidable() && !isa<CXXTemporaryObjectExpr>(CCE)) {
-     auto NumArgs = CCE->getNumArgs();
+     unsigned NumArgs = CCE->getNumArgs();
      if ((NumArgs == 1 ||

    if ((NumArgs == 1 ||
         (NumArgs > 1 && CCE->getArg(1)->isDefaultArgument())) &&
        !CCE->getArg(0)->isDefaultArgument() && !CCE->isListInitialization())
      return CCE->getArg(0);
  }
  return E;
}

inline Expr *IgnoreImplicitAsWrittenSingleStep(Expr *E) {
  if (auto *ICE = dyn_cast<ImplicitCastExpr>(E))
    return ICE->getSubExprAsWritten();
  return IgnoreImplicitSingleStep(E);
}

inline Expr *IgnoreParensOnlySingleStep(Expr *E) {

(30 lines not shown)

clang/include/clang/Basic/Attr.td

(1,364 lines not shown)

  def NoMerge : DeclOrStmtAttr {
    let Spellings = [Clang<"nomerge">];
    let Documentation = [NoMergeDocs];
    let InheritEvenIfAlreadyPresent = 1;
    let Subjects = SubjectList<[Function, Stmt], ErrorDiag,
                               "functions and statements">;
    let SimpleHandler = 1;
  }

+ def MustTail : StmtAttr {
+   let Spellings = [Clang<"musttail">];

aaron.ballman (Done): You should add a `Subjects` list here.

+   let Documentation = [MustTailDocs];
+   let Subjects = SubjectList<[ReturnStmt], ErrorDiag, "return statements">;
+ }

  def FastCall : DeclOrTypeAttr {
    let Spellings = [GCC<"fastcall">, Keyword<"__fastcall">,
                     Keyword<"_fastcall">];
    // let Subjects = [Function, ObjCMethod];
    let Documentation = [FastCallDocs];
  }

  def RegCall : DeclOrTypeAttr {

(2,394 lines not shown)

clang/include/clang/Basic/AttrDocs.td

(437 lines not shown)

  over the tradeoff between code size and debug information precision.
  ``nomerge`` attribute can also be used as function attribute to prevent all
  calls to the specified function from merging. It has no effect on indirect
  calls.
    }];
  }

+ def MustTailDocs : Documentation {
+   let Category = DocCatStmt;
+   let Content = [{
+ If a ``return`` statement is marked ``musttail``, this indicates that the

aaron.ballman (Done), suggested edit:
    let Content = [{
- If a return statement is marked ``musttail``, this indicates that the
+ If a ``return`` statement is marked ``musttail``, this indicates that the
  compiler must generate a tail call for the program to be correct, even when

+ compiler must generate a tail call for the program to be correct, even when
+ optimizations are disabled. This guarantees that the call will not cause
+ unbounded stack growth if it is part of a recursive cycle in the call graph.

rsmith (Done): One thing I'd add: "If the callee is a virtual function that is implemented by a thunk, there is no guarantee in general that the thunk tail-calls the implementation of the virtual function, so such a call in a recursive cycle can still result in unbounded stack growth."

+ If the callee is a virtual function that is implemented by a thunk, there is

aaron.ballman (Done), suggested edit:
  unbounded stack growth if it is part of a recursive cycle in the call graph.
- ``clang::musttail`` can only be applied to a return statement whose value is a
+ ``clang::musttail`` can only be applied to a ``return`` statement whose value is the result of a
  function call (even functions returning void must use 'return', although no

+ no guarantee in general that the thunk tail-calls the implementation of the
+ virtual function, so such a call in a recursive cycle can still result in
+ unbounded stack growth.

aaron.ballman (Done): It'd be nice if we could nail down "similar" somewhat. I don't know if `int` and `short` are similar (due to promotions) or `const int` and `int` are similar, etc.

haberman (Done): Done. I tried to summarize the C++ concept of "similar" types as defined in https://eel.is/c++draft/conv.qual#2 and implemented in https://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#a1b1b3b7a67a30fd817ba85454780d8ad

+ ``clang::musttail`` can only be applied to a ``return`` statement whose value
+ is the result of a function call (even functions returning void must use
+ ``return``, although no value is returned). The target function must have the
+ same number of arguments as the caller. The types of the return value and all

aaron.ballman (Done): Not only is this not usable with K&R C declarations, but it's also not usable with `...` variadic functions either, right?

+ arguments must be similar according to C++ rules (differing only in cv
+ qualifiers or array size), including the implicit "this" argument, if any.
+ Any variables in scope, including all arguments to the function and the
+ return value must be trivially destructible. The calling convention of the
+ caller and callee must match, and they must not be variadic functions or have
+ old style K&R C function declarations.
+   }];
+ }

  def AssertCapabilityDocs : Documentation {
    let Category = DocCatFunction;
    let Heading = "assert_capability, assert_shared_capability";
    let Content = [{
  Marks a function that dynamically tests whether a capability is held, and halts
  the program if it is not held.
    }];
  }

(5,448 lines not shown)
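
As an illustration of the constraints listed in the `MustTailDocs` text, the sketch below tail-calls through a table of function pointers from functions that return `void`, so the `return` keyword is still required. It rests on assumptions: `MUSTTAIL` is a hypothetical helper macro, and `Counter`, `Handler`, `kHandlers`, `dispatch`, `add_one` and `add_two` are invented names.

```cpp
// Hypothetical helper macro: expands to the attribute only where Clang
// reports support for it, and to nothing elsewhere.
#if defined(__clang__) && defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

struct Counter {
  long total;
};

// Every function in the cycle has this exact signature: same number of
// parameters, same types, same (void) return type.
using Handler = void(Counter *c, int remaining);

void dispatch(Counter *c, int remaining);
void add_one(Counter *c, int remaining);
void add_two(Counter *c, int remaining);

static Handler *const kHandlers[2] = {add_one, add_two};

void dispatch(Counter *c, int remaining) {
  if (remaining == 0)
    return;
  // Indirect call through a function pointer; 'return' is used even though
  // no value is returned.
  MUSTTAIL return kHandlers[remaining % 2](c, remaining);
}

void add_one(Counter *c, int remaining) {
  c->total += 1;
  MUSTTAIL return dispatch(c, remaining - 1);
}

void add_two(Counter *c, int remaining) {
  c->total += 2;
  MUSTTAIL return dispatch(c, remaining - 1);
}
```

Starting from `remaining == 4`, the handlers run in the order `add_one`, `add_two`, `add_one`, `add_two`, leaving `total` at 6. No local in any of these functions needs a destructor, which keeps the "trivially destructible" rule satisfied.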

clang/include/clang/Basic/DiagnosticSemaKinds.td

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 2,817 Lines • ▼ Show 20 Lines def warn_auto_var_is_id : Warning<

"'auto' deduced as 'id' in declaration of %0">, "'auto' deduced as 'id' in declaration of %0">,

InGroup<DiagGroup<"auto-var-id">>; InGroup<DiagGroup<"auto-var-id">>;

// Attributes // Attributes

def warn_nomerge_attribute_ignored_in_stmt: Warning< def warn_nomerge_attribute_ignored_in_stmt: Warning<

"%0 attribute is ignored because there exists no call expression inside the " "%0 attribute is ignored because there exists no call expression inside the "

"statement">, "statement">,

InGroup<IgnoredAttributes>; InGroup<IgnoredAttributes>;

def err_musttail_needs_trivial_args : Error<

aaron.ballmanUnsubmitted

Done

This error should not be necessary if you add the correct subject line to Attr.td


"tail call requires that the return value, all parameters, and any "

"temporaries created by the expression are trivially destructible">;

rsmithUnsubmitted

Done

Can we somehow avoid talking about ARC where it's not relevant? While it'd be nice to be more precise here, my main concern is that we shouldn't be mentioning ARC to people for whom it's not a meaningful term (eg, when not compiling Objective-C or Objective-C++). Perhaps the simplest approach would be to only mention ARC if getLangOpts().ObjCAutoRefCount is set?


habermanAuthorUnsubmitted

Done

I implemented this but I couldn't figure out how to actually trigger the ARC case, so I just removed that part of the diagnostic text for now.


def err_musttail_needs_call : Error<

aaron.ballmanUnsubmitted

Done

def err_musttail_needs_call : Error<

- "%0 attribute requires that the return value is a function call, which must "

- "not create or destroy any temporaries.">;

+ "%0 attribute requires that the return value is a function call and must "

+ "not create or destroy any temporaries">;

def err_musttail_only_from_function : Error<


rsmithUnsubmitted

Done

Can we diagnose these two cases separately?


"%0 attribute requires that the return value is the result of a function call"

aaron.ballmanUnsubmitted

Done

def err_musttail_only_from_function : Error<

- "%0 attribute can only be used from a regular function.">;

+ "%0 attribute can only be used from a regular function">;

def err_musttail_return_type_mismatch : Error<

What is a "regular function?"


habermanAuthorUnsubmitted

Done

I may have been trying to distinguish regular functions from blocks or lambdas; I can't exactly remember.

I think I still need to add tests for blocks and Obj-C refcounts. I'm going to leave this comment open for now as a reminder to revisit this.


def err_musttail_needs_prototype : Error<

"%0 attribute requires that both caller and callee functions have a "

"prototype">;

rsmithUnsubmitted

Done

It would be useful to say what didn't match. Eg, parameter index or parameter name.


def note_musttail_fix_non_prototype : Note<

"add 'void' to the parameter list to turn an old-style K&R function "

aaron.ballmanUnsubmitted

Done

All automatic variables require destruction when leaving the scope, so this diagnostic reads oddly to me. Perhaps you mean that the variables must all be trivially destructible?


rsmithUnsubmitted

Done

Can we say which variable? I think it'd be useful to have both a diagnostic and a note here, pointing to the attribute and variable.
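As a rough illustration of the "trivially destructible" wording discussed in this thread, the standard type trait draws a similar line. This is only a sketch: `Plain`, `Guard`, and `allowedInTailCallScope` are hypothetical names, and the check Sema actually performs may differ from this trait.

```cpp
#include <string>
#include <type_traits>

struct Plain { int X; };       // no user-provided destructor
struct Guard { ~Guard() {} };  // destructor must run when leaving scope

// Hypothetical helper: approximates "this variable may be in scope at a
// musttail return" with the standard trait.
template <class T>
constexpr bool allowedInTailCallScope() {
  return std::is_trivially_destructible<T>::value;
}
```

Under this approximation, `Plain` and `int` qualify, while `Guard` and `std::string` do not, because their destructors would have to run after the tail call has already replaced the frame.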


"declaration into a prototype">;

def err_musttail_structors_forbidden : Error<"cannot perform a tail call "

"%select{from|to}0 a %select{constructor|destructor}1">;

def note_musttail_structors_forbidden : Note<"target "

"%select{constructor|destructor}0 is declared here">;

def err_musttail_forbidden_from_this_context : Error<

"%0 attribute cannot be used from "

"%select{a block|an Objective-C function|this context}1">;

rsmithUnsubmitted

Done

"target function "

- "%select{has different class%diff{ (expected $ but has $)|}1,2"

+ "%select{is member of different class%diff{ (expected $ but has $)|}1,2"

"|has different number of parameters (expected %1 but has %2)"

Might be clearer?


def err_musttail_member_mismatch : Error<

"%select{non-member|static member|non-static member}0 "

"function cannot perform a tail call to "

"%select{non-member|static member|non-static member|pointer-to-member}1 "

"function%select{| %3}2">;

def note_musttail_callee_defined_here : Note<"%0 declared here">;

def note_tail_call_required : Note<"tail call required by %0 attribute here">;

def err_musttail_mismatch : Error<

"cannot perform a tail call to function%select{| %1}0 because its signature "

"is incompatible with the calling function">;

def note_musttail_mismatch : Note<

"target function "

"%select{is a member of different class%diff{ (expected $ but has $)|}1,2"

"|has different number of parameters (expected %1 but has %2)"

"|has type mismatch at %ordinal3 parameter"

"%diff{ (expected $ but has $)|}1,2"

"|has different return type%diff{ ($ expected but has $)|}1,2}0">;

def err_musttail_callconv_mismatch : Error<

"cannot perform a tail call to function%select{| %1}0 because it uses an "

"incompatible calling convention">;

def note_musttail_callconv_mismatch : Note<

"target function has calling convention %1 (expected %0)">;

def err_musttail_scope : Error<

"cannot perform a tail call from this return statement">;

def err_musttail_no_variadic : Error<

"%0 attribute may not be used with variadic functions">;

def err_nsobject_attribute : Error< def err_nsobject_attribute : Error<

"'NSObject' attribute is for pointer types only">; "'NSObject' attribute is for pointer types only">;

def err_attributes_are_not_compatible : Error< def err_attributes_are_not_compatible : Error<

"%0 and %1 attributes are not compatible">; "%0 and %1 attributes are not compatible">;

def err_attribute_invalid_argument : Error< def err_attribute_invalid_argument : Error<

"%select{a reference type|an array type|a non-vector or " "%select{a reference type|an array type|a non-vector or "

"non-vectorizable scalar type}0 is an invalid argument to attribute %1">; "non-vectorizable scalar type}0 is an invalid argument to attribute %1">;

def err_attribute_wrong_number_arguments : Error< def err_attribute_wrong_number_arguments : Error<

▲ Show 20 Lines • Show All 8,371 Lines • Show Last 20 Lines

clang/include/clang/Sema/ScopeInfo.h

Show First 20 Lines • Show All 112 Lines • ▼ Show 20 Lines

public:

bool HasBranchProtectedScope : 1;

/// Whether this function contains any switches or direct gotos.

bool HasBranchIntoScope : 1;

/// Whether this function contains any indirect gotos.

bool HasIndirectGoto : 1;

/// Whether this function contains any statement marked with

aaron.ballmanUnsubmitted

Done

bool HasIndirectGoto : 1;

- /// Whether this function contains any statement marked with \c [musttail].

+ /// Whether this function contains any statement marked with \c [[clang::musttail]].

bool HasMustTail : 1;


/// \c [[clang::musttail]].

bool HasMustTail : 1;

/// Whether a statement was dropped because it was invalid.

bool HasDroppedStmt : 1;

/// True if current scope is for OpenMP declare reduction combiner.

bool HasOMPDeclareReductionCombiner : 1;

/// Whether there is a fallthrough statement in this function.

bool HasFallthroughStmt : 1;

▲ Show 20 Lines • Show All 236 Lines • ▼ Show 20 Lines

private:

WeakObjectUseMap WeakObjectUses;

protected:

FunctionScopeInfo(const FunctionScopeInfo&) = default;

public:

FunctionScopeInfo(DiagnosticsEngine &Diag)

: Kind(SK_Function), HasBranchProtectedScope(false),

HasBranchIntoScope(false), HasIndirectGoto(false),

HasBranchIntoScope(false), HasIndirectGoto(false), HasMustTail(false),

HasDroppedStmt(false), HasOMPDeclareReductionCombiner(false),

HasFallthroughStmt(false), UsesFPIntrin(false),

HasPotentialAvailabilityViolations(false),

HasPotentialAvailabilityViolations(false), ObjCShouldCallSuper(false),

ObjCShouldCallSuper(false), ObjCIsDesignatedInit(false),

ObjCIsDesignatedInit(false), ObjCWarnForNoDesignatedInitChain(false),

ObjCWarnForNoDesignatedInitChain(false), ObjCIsSecondaryInit(false),

ObjCIsSecondaryInit(false), ObjCWarnForNoInitDelegation(false),

ObjCWarnForNoInitDelegation(false), NeedsCoroutineSuspends(true),

NeedsCoroutineSuspends(true), ErrorTrap(Diag) {}

ErrorTrap(Diag) {}

virtual ~FunctionScopeInfo();

/// Determine whether an unrecoverable error has occurred within this

/// function. Note that this may return false even if the function body is

/// invalid, because the errors may be suppressed if they're caused by prior

/// invalid declarations.

///

Show All 29 Lines

public:

void setHasBranchProtectedScope() {

HasBranchProtectedScope = true;

}

void setHasIndirectGoto() {

HasIndirectGoto = true;

}

void setHasMustTail() { HasMustTail = true; }

void setHasDroppedStmt() {

HasDroppedStmt = true;

}

void setHasOMPDeclareReductionCombiner() {

HasOMPDeclareReductionCombiner = true;

}

Show All 11 Lines

public:

}

void setHasSEHTry(SourceLocation TryLoc) {

setHasBranchProtectedScope();

FirstSEHTryLoc = TryLoc;

}

bool NeedsScopeChecking() const {

return !HasDroppedStmt &&

return !HasDroppedStmt && (HasIndirectGoto || HasMustTail ||

(HasIndirectGoto ||

(HasBranchProtectedScope && HasBranchIntoScope));

}

// Add a block introduced in this function.

void addBlock(const BlockDecl *BD) {

Blocks.insert(BD);

}

// Add a __block variable introduced in this function.

▲ Show 20 Lines • Show All 589 Lines • Show Last 20 Lines

clang/include/clang/Sema/Sema.h


Show First 20 Lines • Show All 1,842 Lines • ▼ Show 20 Lines	sema::FunctionScopeInfo *getCurFunction() const {
return FunctionScopes.empty() ? nullptr : FunctionScopes.back();		return FunctionScopes.empty() ? nullptr : FunctionScopes.back();
}		}

sema::FunctionScopeInfo *getEnclosingFunction() const;		sema::FunctionScopeInfo *getEnclosingFunction() const;

void setFunctionHasBranchIntoScope();		void setFunctionHasBranchIntoScope();
void setFunctionHasBranchProtectedScope();		void setFunctionHasBranchProtectedScope();
void setFunctionHasIndirectGoto();		void setFunctionHasIndirectGoto();
		void setFunctionHasMustTail();

void PushCompoundScope(bool IsStmtExpr);		void PushCompoundScope(bool IsStmtExpr);
void PopCompoundScope();		void PopCompoundScope();

sema::CompoundScopeInfo &getCurCompoundScope() const;		sema::CompoundScopeInfo &getCurCompoundScope() const;

bool hasAnyUnrecoverableErrorsInThisFunction() const;		bool hasAnyUnrecoverableErrorsInThisFunction() const;

▲ Show 20 Lines • Show All 9,492 Lines • ▼ Show 20 Lines

// Determines which VarArgKind fits an expression.		// Determines which VarArgKind fits an expression.
VarArgKind isValidVarArgType(const QualType &Ty);		VarArgKind isValidVarArgType(const QualType &Ty);

/// Check to see if the given expression is a valid argument to a variadic		/// Check to see if the given expression is a valid argument to a variadic
/// function, issuing a diagnostic if not.		/// function, issuing a diagnostic if not.
void checkVariadicArgument(const Expr *E, VariadicCallType CT);		void checkVariadicArgument(const Expr *E, VariadicCallType CT);

		/// Check whether the given statement can have musttail applied to it,
		/// issuing a diagnostic and returning false if not. In the success case,
		/// the statement is rewritten to remove implicit nodes from the return
		/// value.
		bool checkAndRewriteMustTailAttr(Stmt *St, const Attr &MTA);

		private:
		/// Check whether the given statement can have musttail applied to it,
		/// issuing a diagnostic and returning false if not.
		bool checkMustTailAttr(const Stmt *St, const Attr &MTA);

		public:
/// Check to see if a given expression could have '.c_str()' called on it.		/// Check to see if a given expression could have '.c_str()' called on it.
bool hasCStrMethod(const Expr *E);		bool hasCStrMethod(const Expr *E);

/// GatherArgumentsForCall - Collector argument expressions for various		/// GatherArgumentsForCall - Collector argument expressions for various
/// form of call prototypes.		/// form of call prototypes.
bool GatherArgumentsForCall(SourceLocation CallLoc, FunctionDecl *FDecl,		bool GatherArgumentsForCall(SourceLocation CallLoc, FunctionDecl *FDecl,
const FunctionProtoType *Proto,		const FunctionProtoType *Proto,
unsigned FirstParam, ArrayRef<Expr *> Args,		unsigned FirstParam, ArrayRef<Expr *> Args,
▲ Show 20 Lines • Show All 1,622 Lines • Show Last 20 Lines

clang/lib/CodeGen/CGCall.cpp

Show First 20 Lines • Show All 4,553 Lines • ▼ Show 20 Lines

}; };

} // namespace } // namespace

RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,

const CGCallee &Callee, const CGCallee &Callee,

ReturnValueSlot ReturnValue, ReturnValueSlot ReturnValue,

const CallArgList &CallArgs, const CallArgList &CallArgs,

llvm::CallBase **callOrInvoke, llvm::CallBase **callOrInvoke, bool IsMustTail,

SourceLocation Loc) { SourceLocation Loc) {

// FIXME: We no longer need the types from CallArgs; lift up and simplify. // FIXME: We no longer need the types from CallArgs; lift up and simplify.

assert(Callee.isOrdinary() || Callee.isVirtual()); assert(Callee.isOrdinary() || Callee.isVirtual());

// Handle struct-return functions by passing a pointer to the // Handle struct-return functions by passing a pointer to the

// location that we would like to return into. // location that we would like to return into.

QualType RetTy = CallInfo.getReturnType(); QualType RetTy = CallInfo.getReturnType();

▲ Show 20 Lines • Show All 676 Lines • ▼ Show 20 Lines if (!CI->getCalledFunction())

PGO.valueProfile(Builder, llvm::IPVK_IndirectCallTarget, PGO.valueProfile(Builder, llvm::IPVK_IndirectCallTarget,

CI, CalleePtr); CI, CalleePtr);

// In ObjC ARC mode with no ObjC ARC exception safety, tell the ARC // In ObjC ARC mode with no ObjC ARC exception safety, tell the ARC

// optimizer it can aggressively ignore unwind edges. // optimizer it can aggressively ignore unwind edges.

if (CGM.getLangOpts().ObjCAutoRefCount) if (CGM.getLangOpts().ObjCAutoRefCount)

AddObjCARCExceptionMetadata(CI); AddObjCARCExceptionMetadata(CI);

// Suppress tail calls if requested. // Set tail call kind if necessary.

if (llvm::CallInst *Call = dyn_cast<llvm::CallInst>(CI)) { if (llvm::CallInst *Call = dyn_cast<llvm::CallInst>(CI)) {

if (TargetDecl && TargetDecl->hasAttr<NotTailCalledAttr>()) if (TargetDecl && TargetDecl->hasAttr<NotTailCalledAttr>())

Call->setTailCallKind(llvm::CallInst::TCK_NoTail); Call->setTailCallKind(llvm::CallInst::TCK_NoTail);

else if (IsMustTail)

Call->setTailCallKind(llvm::CallInst::TCK_MustTail);

} }

// Add metadata for calls to MSAllocator functions // Add metadata for calls to MSAllocator functions

if (getDebugInfo() && TargetDecl && if (getDebugInfo() && TargetDecl &&

TargetDecl->hasAttr<MSAllocatorAttr>()) TargetDecl->hasAttr<MSAllocatorAttr>())

getDebugInfo()->addHeapAllocSiteMetadata(CI, RetTy->getPointeeType(), Loc); getDebugInfo()->addHeapAllocSiteMetadata(CI, RetTy->getPointeeType(), Loc);

// 4. Finish the call. // 4. Finish the call.

Show All 35 Lines if (CI->doesNotReturn()) {

// generally are not ready to handle emitting expressions at unreachable // generally are not ready to handle emitting expressions at unreachable

// points. // points.

EnsureInsertPoint(); EnsureInsertPoint();

// Return a reasonable RValue. // Return a reasonable RValue.

return GetUndefRValue(RetTy); return GetUndefRValue(RetTy);

} }

// If this is a musttail call, return immediately. We do not branch to the

// epilogue in this case.

rsmithUnsubmitted

Done

epilogue?


if (IsMustTail) {

for (auto it = EHStack.find(CurrentCleanupScopeDepth); it != EHStack.end();

++it) {

EHCleanupScope *Cleanup = dyn_cast<EHCleanupScope>(&*it);

aaron.ballmanUnsubmitted

Done

Are you planning to handle this TODO in the patch? If not, can you switch to a FIXME without a name associated with it?


habermanAuthorUnsubmitted

Done

I am interested in feedback on the best way to proceed here.

- Is my assessment correct that we should have an assertion that validates this?
- Is such an assertion reasonably feasible to implement?
- Is it ok to defer with FIXME, or should I try to fix it in this patch?

I've changed it to a FIXME for now.


rsmithUnsubmitted

Done

Yes, I think we should validate this by an assertion if we can. We can check this by walking the cleanup scope stack (walk from CurrentCleanupScopeDepth to EHScopeStack::stable_end()) and making sure that there is no "problematic" enclosing cleanup scope. Here, "problematic" would mean any scope other than an EHCleanupScope containing only CallLifetimeEnd cleanups.

Looking at the kinds of cleanups that we might encounter here, I think there may be a few more things that Sema needs to check in order to not get in the way of exception handling. In particular, I think we should reject if the callee is potentially-throwing and the musttail call is inside a try block or a function that's either noexcept or has a dynamic exception specification.

Oh, also, we should disallow musttail calls inside statement expressions, in order to defend against cleanups that exist transiently within an expression.


habermanAuthorUnsubmitted

Done

I'm having trouble implementing the check because there doesn't appear to be any discriminator in EHScopeStack::Cleanup that will let you test if it is a CallLifetimeEnd. (The actual code just does virtual dispatch through EHScopeStack::Cleanup::Emit().)

I temporarily implemented this by adding an extra virtual function to act as discriminator. The check fires if a VLA is in scope:

int Func14(int x) {
  int vla[x];
  [[clang::musttail]] return Bar(x);
}

Do we need to forbid VLAs or do I need to refine the check?

It appears that JumpDiagnostics.cpp is already diagnosing statement expressions and try. However I could not get testing to work. I tried adding a test with try but even with -fexceptions I am getting:

cannot use 'try' with exceptions disabled


rsmithUnsubmitted

Done

> Do we need to forbid VLAs or do I need to refine the check?

Assuming that LLVM supports musttail calls from functions where a dynamic alloca is in scope, I think we should allow VLAs. The musttail documentation doesn't mention this, so I think it's OK, and I can't think of a good reason why you wouldn't be able to musttail call due to a variably-sized frame.

Perhaps a good model would be to add a virtual function to permit asking a cleanup whether it's optional / skippable.

> I could not get testing to work.

You need -fcxx-exceptions to use try. At the -cc1 level, we have essentially-orthogonal settings for "it's valid for exceptions to unwind through this code" (-fexceptions) and "C++ exception handling syntax is permitted" (-fcxx-exceptions), and you usually need to enable both for CodeGen tests involving exceptions.


rsmithUnsubmitted

Done

Or maybe instead of "is optional / skippable", the right question is, "is this redundant if we're about to return?" That way we could potentially one day reuse the same mechanism to also skip emitting such cleanups when emitting a cleanup path into the return block.
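The "is this redundant if we're about to return?" idea can be sketched outside of Clang as follows. Everything here is a simplified stand-in: `Cleanup`, `StackRestore`, `DestructorCall`, and `canSkipAllCleanups` are hypothetical names, not the real `EHScopeStack` types, and the real check also has to reject scopes that are not cleanup scopes at all.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for a cleanup base class.
struct Cleanup {
  virtual ~Cleanup() = default;
  // Conservative default: a cleanup must run before leaving the function.
  virtual bool isRedundantBeforeReturn() const { return false; }
};

// Models restoring the stack pointer after a VLA: returning from the
// function discards the frame anyway, so skipping this is harmless.
struct StackRestore final : Cleanup {
  bool isRedundantBeforeReturn() const override { return true; }
};

// Models a destructor call, which must never be skipped.
struct DestructorCall final : Cleanup {};

using CleanupStack = std::vector<std::unique_ptr<Cleanup>>;

// Returns true only if every pending cleanup may be skipped by a tail call.
bool canSkipAllCleanups(const CleanupStack &Stack) {
  for (const auto &C : Stack)
    if (!C->isRedundantBeforeReturn())
      return false;
  return true;
}

// Example stacks: one with only a VLA stack restore pending...
CleanupStack onlyStackRestore() {
  CleanupStack S;
  S.push_back(std::make_unique<StackRestore>());
  return S;
}

// ...and one that also has a destructor pending.
CleanupStack withDestructor() {
  CleanupStack S = onlyStackRestore();
  S.push_back(std::make_unique<DestructorCall>());
  return S;
}
```

The conservative default means any new kind of cleanup blocks the tail call until someone explicitly marks it as skippable, which matches the concern about more exotic cleanups being added later.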


if (!(Cleanup && Cleanup->getCleanup()->isRedundantBeforeReturn()))

CGM.ErrorUnsupported(MustTailCall, "tail call skipping over cleanups");

rsmithUnsubmitted

Done

Given the potential for mismatch between the JumpDiagnostics checks and this one, especially as new more exotic kinds of cleanup are added, I wonder if we should use an ErrorUnsupported here instead of an assert.

I strongly suspect we can still reach the problematic case here for a tail call in a statement expression. I don't think it's feasible to check for all the ways that an arbitrary expression context can have pending cleanups, which we'd need in order to produce precise Sema diagnostics for that, so either we handle that here or we blanket reject all musttail returns in statement expressions. I think either approach is probably acceptable.


rjmccallUnsubmitted

Done

Yes, I think ErrorUnsupported is a much better idea.


chfastUnsubmitted

Not Done

I reported a related issue. I wonder if this is easy to fix. https://github.com/llvm/llvm-project/issues/53087.


}

if (CI->getType()->isVoidTy())

Builder.CreateRetVoid();

aaron.ballmanUnsubmitted

Done

// here.

- if (RetTy->isVoidType()) {

+ if (RetTy->isVoidType())

Builder.CreateRetVoid();

- } else {

+ else

Builder.CreateRet(CI);

- }

Builder.ClearInsertionPoint();


else

Builder.CreateRet(CI);

Builder.ClearInsertionPoint();

EnsureInsertPoint();

return GetUndefRValue(RetTy);

}

// Perform the swifterror writeback. // Perform the swifterror writeback.

if (swiftErrorTemp.isValid()) { if (swiftErrorTemp.isValid()) {

llvm::Value *errorResult = Builder.CreateLoad(swiftErrorTemp); llvm::Value *errorResult = Builder.CreateLoad(swiftErrorTemp);

Builder.CreateStore(errorResult, swiftErrorArg); Builder.CreateStore(errorResult, swiftErrorArg);

} }

// Emit any call-associated writebacks immediately. Arguably this // Emit any call-associated writebacks immediately. Arguably this

// should happen after any return-value munging. // should happen after any return-value munging.

▲ Show 20 Lines • Show All 145 Lines • Show Last 20 Lines

clang/lib/CodeGen/CGClass.cpp

Show First 20 Lines • Show All 2,176 Lines • ▼ Show 20 Lines	CGCXXABI::AddedStructorArgCounts ExtraArgs =
CGM.getCXXABI().addImplicitConstructorArgs(*this, D, Type, ForVirtualBase,		CGM.getCXXABI().addImplicitConstructorArgs(*this, D, Type, ForVirtualBase,
Delegating, Args);		Delegating, Args);

// Emit the call.		// Emit the call.
llvm::Constant *CalleePtr = CGM.getAddrOfCXXStructor(GlobalDecl(D, Type));		llvm::Constant *CalleePtr = CGM.getAddrOfCXXStructor(GlobalDecl(D, Type));
const CGFunctionInfo &Info = CGM.getTypes().arrangeCXXConstructorCall(		const CGFunctionInfo &Info = CGM.getTypes().arrangeCXXConstructorCall(
Args, D, Type, ExtraArgs.Prefix, ExtraArgs.Suffix, PassPrototypeArgs);		Args, D, Type, ExtraArgs.Prefix, ExtraArgs.Suffix, PassPrototypeArgs);
CGCallee Callee = CGCallee::forDirect(CalleePtr, GlobalDecl(D, Type));		CGCallee Callee = CGCallee::forDirect(CalleePtr, GlobalDecl(D, Type));
EmitCall(Info, Callee, ReturnValueSlot(), Args, nullptr, Loc);		EmitCall(Info, Callee, ReturnValueSlot(), Args, nullptr, false, Loc);

// Generate vtable assumptions if we're constructing a complete object		// Generate vtable assumptions if we're constructing a complete object
// with a vtable. We don't do this for base subobjects for two reasons:		// with a vtable. We don't do this for base subobjects for two reasons:
// first, it's incorrect for classes with virtual bases, and second, we're		// first, it's incorrect for classes with virtual bases, and second, we're
// about to overwrite the vptrs anyway.		// about to overwrite the vptrs anyway.
// We also have to make sure if we can refer to vtable:		// We also have to make sure if we can refer to vtable:
// - Otherwise we can refer to vtable if it's safe to speculatively emit.		// - Otherwise we can refer to vtable if it's safe to speculatively emit.
// FIXME: If vtable is used by ctor/dtor, or if vtable is external and we are		// FIXME: If vtable is used by ctor/dtor, or if vtable is external and we are
▲ Show 20 Lines • Show All 782 Lines • Show Last 20 Lines

clang/lib/CodeGen/CGDecl.cpp

Show First 20 Lines • Show All 544 Lines • ▼ Show 20 Lines	struct DestroyNRVOVariableC final
void emitDestructorCall(CodeGenFunction &CGF) {		void emitDestructorCall(CodeGenFunction &CGF) {
CGF.destroyNonTrivialCStruct(CGF, Loc, Ty);		CGF.destroyNonTrivialCStruct(CGF, Loc, Ty);
}		}
};		};

struct CallStackRestore final : EHScopeStack::Cleanup {		struct CallStackRestore final : EHScopeStack::Cleanup {
Address Stack;		Address Stack;
CallStackRestore(Address Stack) : Stack(Stack) {}		CallStackRestore(Address Stack) : Stack(Stack) {}
		bool isRedundantBeforeReturn() override { return true; }
void Emit(CodeGenFunction &CGF, Flags flags) override {		void Emit(CodeGenFunction &CGF, Flags flags) override {
llvm::Value *V = CGF.Builder.CreateLoad(Stack);		llvm::Value *V = CGF.Builder.CreateLoad(Stack);
llvm::Function *F = CGF.CGM.getIntrinsic(llvm::Intrinsic::stackrestore);		llvm::Function *F = CGF.CGM.getIntrinsic(llvm::Intrinsic::stackrestore);
CGF.Builder.CreateCall(F, V);		CGF.Builder.CreateCall(F, V);
}		}
};		};

struct ExtendGCLifetime final : EHScopeStack::Cleanup {		struct ExtendGCLifetime final : EHScopeStack::Cleanup {
▲ Show 20 Lines • Show All 2,053 Lines • Show Last 20 Lines

clang/lib/CodeGen/CGExpr.cpp

Show All 32 Lines

#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringExtras.h"

#include "llvm/IR/DataLayout.h" #include "llvm/IR/DataLayout.h"

#include "llvm/IR/Intrinsics.h" #include "llvm/IR/Intrinsics.h"

#include "llvm/IR/LLVMContext.h" #include "llvm/IR/LLVMContext.h"

#include "llvm/IR/MDBuilder.h" #include "llvm/IR/MDBuilder.h"

#include "llvm/Support/ConvertUTF.h" #include "llvm/Support/ConvertUTF.h"

#include "llvm/Support/MathExtras.h" #include "llvm/Support/MathExtras.h"

#include "llvm/Support/Path.h" #include "llvm/Support/Path.h"

#include "llvm/Support/SaveAndRestore.h"

#include "llvm/Transforms/Utils/SanitizerStats.h" #include "llvm/Transforms/Utils/SanitizerStats.h"

#include <string> #include <string>

using namespace clang; using namespace clang;

using namespace CodeGen; using namespace CodeGen;

//===--------------------------------------------------------------------===// //===--------------------------------------------------------------------===//

▲ Show 20 Lines • Show All 4,771 Lines • ▼ Show 20 Lines

} }

//===--------------------------------------------------------------------===// //===--------------------------------------------------------------------===//

// Expression Emission // Expression Emission

//===--------------------------------------------------------------------===// //===--------------------------------------------------------------------===//

RValue CodeGenFunction::EmitCallExpr(const CallExpr *E, RValue CodeGenFunction::EmitCallExpr(const CallExpr *E,

ReturnValueSlot ReturnValue) { ReturnValueSlot ReturnValue) {

// Builtins never have block type. // Builtins never have block type.

aaron.ballmanUnsubmitted

Done

ReturnValueSlot ReturnValue) {

- SaveAndRestore<bool> save_musttail(InMustTailCallExpr, E == MustTailCall);

+ SaveAndRestore<bool> SaveMustTail(InMustTailCallExpr, E == MustTailCall);

// Builtins never have block type.


rsmithUnsubmitted

Done

The more I think about this, the more it makes me nervous: if any of the Emit*CallExpr functions below incidentally emit a call on the way to producing their results via the CGCall machinery, and do so without recursing through this function, that incidental call will be emitted as a tail call instead of the intended one. Specifically:

- I could imagine a block call involving multiple function calls, depending on the blocks ABI.
- I could imagine a member call performing a function call to convert from derived to virtual base in some ABIs.
- A CUDA kernel call in general involves calling a setup function before the actual function call happens (and it doesn't make sense for a CUDA kernel call to be a tail call anyway...)
- A call to a builtin can result in any number of function calls.
- If any expression in the function arguments emits a call without calling back into this function, we'll emit that call as a tail call instead of this one. Eg, [[clang::musttail]] return f(dynamic_cast<T*>(p)); might emit the call to __cxa_dynamic_cast as the tail call instead of emitting the call to f as the tail call, depending on whether the CGCall machinery is used when emitting the __cxa_dynamic_cast call.

Is it feasible to sink this check into the CodeGenFunction::EmitCall overload that takes a CallExpr, CodeGenFunction::EmitCXXMemberOrOperatorCall, and CodeGenFunction::EmitCXXMemberPointerCallExpr, after we've emitted the callee and call args? It looks like we might be able to check this immediately before calling the CGCall overload of EmitCall, so we could pass in the 'musttail' information as a flag or similar instead of using global state in the CodeGenFunction object; if so, it'd be much easier to be confident that we're applying the attribute to the right call.
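The hazard described above can be reproduced in miniature. This is a toy model under stated assumptions: `Emitter`, `EmittedCall`, and the two `lower...` functions are hypothetical and only mimic the shape of the problem; they are not Clang's CodeGen.

```cpp
#include <string>
#include <vector>

// Record of one emitted call in the toy model.
struct EmittedCall {
  std::string Callee;
  bool MustTail;
};

struct Emitter {
  std::vector<EmittedCall> Calls;
  bool InMustTailCallExpr = false; // ambient state, as in the earlier revision

  // Low-level emission that consults the ambient state.
  void emitWithGlobalState(const std::string &Callee) {
    Calls.push_back({Callee, InMustTailCallExpr});
  }

  // Low-level emission that is told explicitly whether this is the tail call.
  void emitWithFlag(const std::string &Callee, bool IsMustTail) {
    Calls.push_back({Callee, IsMustTail});
  }
};

// Models `[[clang::musttail]] return f(dynamic_cast<T*>(p));` when the
// helper call is emitted while the ambient flag is still set.
std::vector<EmittedCall> lowerWithGlobalState() {
  Emitter E;
  E.InMustTailCallExpr = true;
  E.emitWithGlobalState("__cxa_dynamic_cast"); // wrongly marked musttail
  E.emitWithGlobalState("f");
  return E.Calls;
}

// Same lowering, but the decision is passed at the final emission point.
std::vector<EmittedCall> lowerWithFlag() {
  Emitter E;
  E.emitWithFlag("__cxa_dynamic_cast", false);
  E.emitWithFlag("f", true);
  return E.Calls;
}
```

In the first lowering the helper call inherits the marking; in the second only `f` carries it, which is the property the explicit flag is meant to make easy to verify.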


habermanAuthorUnsubmitted

Done

Done. It's feeling like IsMustTail, callOrInvoke, and Loc might want to get collapsed into an options struct, especially given the default parameters on the first two. Maybe could do as a follow up?


rsmithUnsubmitted

Done

I agree, that sounds like a nice cleanup. Delaying this to a future change makes sense to me.
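One possible shape for the follow-up cleanup mentioned in this thread is sketched below. It is purely illustrative: `EmitCallOptions`, `SourceLoc`, `CallBase`, and `emitCall` are simplified hypothetical stand-ins, and the patch itself keeps the three separate parameters.

```cpp
// Simplified stand-ins for the real types (assumptions for this sketch).
struct SourceLoc { unsigned Line = 0; };
struct CallBase {};

// Groups the trailing parameters so call sites name what they set and
// leave everything else at its default.
struct EmitCallOptions {
  CallBase **CallOrInvoke = nullptr;
  bool IsMustTail = false;
  SourceLoc Loc{};
};

// Stand-in for the emission routine: reports whether the call would be
// marked as a musttail call.
bool emitCall(const char *Callee, const EmitCallOptions &Opts = {}) {
  (void)Callee; // unused in this toy model
  return Opts.IsMustTail;
}

// A call site that needs no options, such as a constructor call.
bool plainCall() { return emitCall("ctor"); }

// A call site for the expression under a musttail return.
bool tailCall() {
  EmitCallOptions Opts;
  Opts.IsMustTail = true;
  Opts.Loc.Line = 42;
  return emitCall("f", Opts);
}
```

Call sites that do not care about any option stay short, and a positional `false` no longer needs to be threaded through callers such as the constructor-call path.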


if (E->getCallee()->getType()->isBlockPointerType()) if (E->getCallee()->getType()->isBlockPointerType())

return EmitBlockCallExpr(E, ReturnValue); return EmitBlockCallExpr(E, ReturnValue);

if (const auto *CE = dyn_cast<CXXMemberCallExpr>(E)) if (const auto *CE = dyn_cast<CXXMemberCallExpr>(E))

return EmitCXXMemberCallExpr(CE, ReturnValue); return EmitCXXMemberCallExpr(CE, ReturnValue);

if (const auto *CE = dyn_cast<CUDAKernelCallExpr>(E)) if (const auto *CE = dyn_cast<CUDAKernelCallExpr>(E))

return EmitCUDAKernelCallExpr(CE, ReturnValue); return EmitCUDAKernelCallExpr(CE, ReturnValue);

▲ Show 20 Lines • Show All 445 Lines • ▼ Show 20 Lines if (CGM.getLangOpts().HIP && !CGM.getLangOpts().CUDAIsDevice &&

llvm::Value *Handle = Callee.getFunctionPointer(); llvm::Value *Handle = Callee.getFunctionPointer();

auto *Cast = auto *Cast =

Builder.CreateBitCast(Handle, Handle->getType()->getPointerTo()); Builder.CreateBitCast(Handle, Handle->getType()->getPointerTo());

auto *Stub = Builder.CreateLoad(Address(Cast, CGM.getPointerAlign())); auto *Stub = Builder.CreateLoad(Address(Cast, CGM.getPointerAlign()));

Callee.setFunctionPointer(Stub); Callee.setFunctionPointer(Stub);

} }

llvm::CallBase *CallOrInvoke = nullptr; llvm::CallBase *CallOrInvoke = nullptr;

RValue Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &CallOrInvoke, RValue Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &CallOrInvoke,

E->getExprLoc()); E == MustTailCall, E->getExprLoc());

// Generate function declaration DISuprogram in order to be used // Generate function declaration DISuprogram in order to be used

// in debug info about call sites. // in debug info about call sites.

if (CGDebugInfo *DI = getDebugInfo()) { if (CGDebugInfo *DI = getDebugInfo()) {

if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl))

DI->EmitFuncDeclForCallSite(CallOrInvoke, QualType(FnType, 0), DI->EmitFuncDeclForCallSite(CallOrInvoke, QualType(FnType, 0),

CalleeDecl); CalleeDecl);

} }

▲ Show 20 Lines • Show All 140 Lines • Show Last 20 Lines

clang/lib/CodeGen/CGExprCXX.cpp

Show First 20 Lines • Show All 81 Lines • ▼ Show 20 Lines	RValue CodeGenFunction::EmitCXXMemberOrOperatorCall(
const CallExpr *CE, CallArgList *RtlArgs) {		const CallExpr *CE, CallArgList *RtlArgs) {
const FunctionProtoType *FPT = MD->getType()->castAs<FunctionProtoType>();		const FunctionProtoType *FPT = MD->getType()->castAs<FunctionProtoType>();
CallArgList Args;		CallArgList Args;
MemberCallInfo CallInfo = commonEmitCXXMemberOrOperatorCall(		MemberCallInfo CallInfo = commonEmitCXXMemberOrOperatorCall(
*this, MD, This, ImplicitParam, ImplicitParamTy, CE, Args, RtlArgs);		*this, MD, This, ImplicitParam, ImplicitParamTy, CE, Args, RtlArgs);
auto &FnInfo = CGM.getTypes().arrangeCXXMethodCall(		auto &FnInfo = CGM.getTypes().arrangeCXXMethodCall(
Args, FPT, CallInfo.ReqArgs, CallInfo.PrefixSize);		Args, FPT, CallInfo.ReqArgs, CallInfo.PrefixSize);
return EmitCall(FnInfo, Callee, ReturnValue, Args, nullptr,		return EmitCall(FnInfo, Callee, ReturnValue, Args, nullptr,
		CE && CE == MustTailCall,
CE ? CE->getExprLoc() : SourceLocation());		CE ? CE->getExprLoc() : SourceLocation());
}		}

RValue CodeGenFunction::EmitCXXDestructorCall(		RValue CodeGenFunction::EmitCXXDestructorCall(
GlobalDecl Dtor, const CGCallee &Callee, llvm::Value *This, QualType ThisTy,		GlobalDecl Dtor, const CGCallee &Callee, llvm::Value *This, QualType ThisTy,
llvm::Value *ImplicitParam, QualType ImplicitParamTy, const CallExpr *CE) {		llvm::Value *ImplicitParam, QualType ImplicitParamTy, const CallExpr *CE) {
const CXXMethodDecl *DtorDecl = cast<CXXMethodDecl>(Dtor.getDecl());		const CXXMethodDecl *DtorDecl = cast<CXXMethodDecl>(Dtor.getDecl());

Show All 9 Lines	if (SrcAS != DstAS) {
This = getTargetHooks().performAddrSpaceCast(*this, This, SrcAS, DstAS,		This = getTargetHooks().performAddrSpaceCast(*this, This, SrcAS, DstAS,
NewType);		NewType);
}		}

CallArgList Args;		CallArgList Args;
commonEmitCXXMemberOrOperatorCall(*this, DtorDecl, This, ImplicitParam,		commonEmitCXXMemberOrOperatorCall(*this, DtorDecl, This, ImplicitParam,
ImplicitParamTy, CE, Args, nullptr);		ImplicitParamTy, CE, Args, nullptr);
return EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,		return EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,
ReturnValueSlot(), Args, nullptr,		ReturnValueSlot(), Args, nullptr, CE && CE == MustTailCall,
CE ? CE->getExprLoc() : SourceLocation{});		CE ? CE->getExprLoc() : SourceLocation{});
}		}

RValue CodeGenFunction::EmitCXXPseudoDestructorExpr(		RValue CodeGenFunction::EmitCXXPseudoDestructorExpr(
const CXXPseudoDestructorExpr *E) {		const CXXPseudoDestructorExpr *E) {
QualType DestroyedType = E->getDestroyedType();		QualType DestroyedType = E->getDestroyedType();
if (DestroyedType.hasStrongOrWeakObjCLifetime()) {		if (DestroyedType.hasStrongOrWeakObjCLifetime()) {
// Automatic Reference Counting:		// Automatic Reference Counting:
▲ Show 20 Lines • Show All 343 Lines • ▼ Show 20 Lines	CodeGenFunction::EmitCXXMemberPointerCallExpr(const CXXMemberCallExpr *E,
Args.add(RValue::get(ThisPtrForCall), ThisType);		Args.add(RValue::get(ThisPtrForCall), ThisType);

RequiredArgs required = RequiredArgs::forPrototypePlus(FPT, 1);		RequiredArgs required = RequiredArgs::forPrototypePlus(FPT, 1);

// And the rest of the call args		// And the rest of the call args
EmitCallArgs(Args, FPT, E->arguments());		EmitCallArgs(Args, FPT, E->arguments());
return EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,		return EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,
/*PrefixSize=*/0),		/*PrefixSize=*/0),
Callee, ReturnValue, Args, nullptr, E->getExprLoc());		Callee, ReturnValue, Args, nullptr, E == MustTailCall,
		E->getExprLoc());
}		}

RValue		RValue
CodeGenFunction::EmitCXXOperatorMemberCallExpr(const CXXOperatorCallExpr *E,		CodeGenFunction::EmitCXXOperatorMemberCallExpr(const CXXOperatorCallExpr *E,
const CXXMethodDecl *MD,		const CXXMethodDecl *MD,
ReturnValueSlot ReturnValue) {		ReturnValueSlot ReturnValue) {
assert(MD->isInstance() &&		assert(MD->isInstance() &&
"Trying to emit a member call expr on a static method!");		"Trying to emit a member call expr on a static method!");
▲ Show 20 Lines • Show All 1,838 Lines • Show Last 20 Lines

clang/lib/CodeGen/CGStmt.cpp

Show All 10 Lines

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#include "CGDebugInfo.h" #include "CGDebugInfo.h"

#include "CGOpenMPRuntime.h" #include "CGOpenMPRuntime.h"

#include "CodeGenFunction.h" #include "CodeGenFunction.h"

#include "CodeGenModule.h" #include "CodeGenModule.h"

#include "TargetInfo.h" #include "TargetInfo.h"

#include "clang/AST/Attr.h" #include "clang/AST/Attr.h"

#include "clang/AST/Expr.h"

#include "clang/AST/Stmt.h"

#include "clang/AST/StmtVisitor.h" #include "clang/AST/StmtVisitor.h"

#include "clang/Basic/Builtins.h" #include "clang/Basic/Builtins.h"

#include "clang/Basic/DiagnosticSema.h" #include "clang/Basic/DiagnosticSema.h"

#include "clang/Basic/PrettyStackTrace.h" #include "clang/Basic/PrettyStackTrace.h"

#include "clang/Basic/SourceManager.h" #include "clang/Basic/SourceManager.h"

#include "clang/Basic/TargetInfo.h" #include "clang/Basic/TargetInfo.h"

#include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallSet.h"

#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringExtras.h"

▲ Show 20 Lines • Show All 617 Lines • ▼ Show 20 Lines

void CodeGenFunction::EmitLabelStmt(const LabelStmt &S) { void CodeGenFunction::EmitLabelStmt(const LabelStmt &S) {

EmitLabel(S.getDecl()); EmitLabel(S.getDecl());

EmitStmt(S.getSubStmt()); EmitStmt(S.getSubStmt());

} }

void CodeGenFunction::EmitAttributedStmt(const AttributedStmt &S) { void CodeGenFunction::EmitAttributedStmt(const AttributedStmt &S) {

bool nomerge = false; bool nomerge = false;

for (const auto *A : S.getAttrs()) const CallExpr *musttail = nullptr;

for (const auto *A : S.getAttrs()) {

if (A->getKind() == attr::NoMerge) { if (A->getKind() == attr::NoMerge) {

nomerge = true; nomerge = true;

break; }

if (A->getKind() == attr::MustTail) {

const Stmt *Sub = S.getSubStmt();

const ReturnStmt *R = cast<ReturnStmt>(Sub);

musttail = cast<CallExpr>(R->getRetValue()->IgnoreParens());

}

} }

aaron.ballman:

const Stmt *Sub = S.getSubStmt();

- const ReturnStmt *R = dyn_cast<ReturnStmt>(Sub);

- assert(R && "musttail should only be on ReturnStmt");

- musttail = dyn_cast<CallExpr>(R->getRetValue());

- assert(musttail && "musttail must return CallExpr");

+ const ReturnStmt *R = cast<ReturnStmt>(Sub);

+ musttail = cast<CallExpr>(R->getRetValue());

}

SaveAndRestore<bool> save_nomerge(InNoMergeAttributedStmt, nomerge);

SaveAndRestore<bool> save_nomerge(InNoMergeAttributedStmt, nomerge); SaveAndRestore<bool> save_nomerge(InNoMergeAttributedStmt, nomerge);

SaveAndRestore<const CallExpr *> save_musttail(MustTailCall, musttail);

EmitStmt(S.getSubStmt(), S.getAttrs()); EmitStmt(S.getSubStmt(), S.getAttrs());

rsmith:

In the case where we're forcibly eliding a constructor, we'll need to emit a return statement that returns the musttail call expression here rather than emitting the original substatement. Otherwise the tail call we emit will be initializing a local temporary rather than initializing our return slot. E.g., given:

struct A {
  A(const A&);
  ~A();
  char data[32];
};
A f();
A g() {
  [[clang::musttail]] return f();
}

under -fno-elide-constructors when targeting C++11, say, we'll normally lower that into something like:

void f(A *return_slot);
void g(A *return_slot) {
  A temporary; //uninitialized
  f(&temporary); // call f
  A::A(return_slot, temporary); // call copy constructor to copy into return slot
}

... and with the current patch, it looks like we'll add a 'ret void' after the call to f, leaving g's return slot uninitialized and passing an address into f that refers to a variable that will no longer exist once f is called. We need to instead lower to:

void f(A *return_slot);
void g(A *return_slot) {
  f(return_slot); // call f
}

Probably the easiest way to do this would be to change the return value on the ReturnStmt to be the tail-called CallExpr when attaching the attribute.

haberman (Author):

Done.

I had to change your test case to remove the destructor; otherwise it fails the trivial destruction check.

Take a look at the CodeGen tests and see if the output looks correct to you.
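As a rough illustration of the return-slot point in this thread (not taken from the patch), the sketch below checks in plain C++ that `return f();` lets `f` construct its result directly in the caller's storage when the copy is elided. The struct mirrors the test case without the destructor. The `LastConstructed` and `Copies` members and the `returnSlotIsForwarded` helper are hypothetical names added for the demonstration. Elision here is guaranteed from C++17 on; for earlier standards it is only assumed, and it does not hold under -fno-elide-constructors.

```cpp
// Records where objects are constructed so the return slot can be observed.
struct A {
  static const A *LastConstructed; // address of the most recently default-constructed A
  static int Copies;               // number of copy-constructor calls
  A() { LastConstructed = this; }
  A(const A &) { ++Copies; }
  char data[32];
};
const A *A::LastConstructed = nullptr;
int A::Copies = 0;

A f() { return A(); }

// Without the attribute this is an ordinary return; the point is only that
// g hands its own return slot to f instead of copying from a temporary.
A g() { return f(); }

// True if f constructed its result directly in the storage of `a`.
bool returnSlotIsForwarded() {
  A::Copies = 0;
  A a = g();
  return A::Copies == 0 && A::LastConstructed == &a;
}
```

When the check holds, `g` behaves like the desired lowering `f(return_slot)`, which is the shape a musttail call needs.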

} }

void CodeGenFunction::EmitGotoStmt(const GotoStmt &S) { void CodeGenFunction::EmitGotoStmt(const GotoStmt &S) {

// If this code is reachable then emit a stop point (if generating // If this code is reachable then emit a stop point (if generating

// debug info). We have to do this ourselves because we are on the // debug info). We have to do this ourselves because we are on the

// "simple" statement path. // "simple" statement path.

if (HaveInsertPoint()) if (HaveInsertPoint())

EmitStopPoint(&S); EmitStopPoint(&S);

▲ Show 20 Lines • Show All 2,080 Lines • Show Last 20 Lines
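The `SaveAndRestore` plus `E == MustTailCall` pattern used in this file and in the `EmitCall` callers can be modeled in isolation. The following is a minimal sketch, assuming a toy `Emitter` in place of `CodeGenFunction`. `ScopedValue` is a hypothetical stand-in for `llvm::SaveAndRestore`, and `Call`, `emitAttributedReturn`, and `runDemo` are invented for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::SaveAndRestore: sets a variable on
// construction and restores the previous value on destruction.
template <typename T> class ScopedValue {
  T &Ref;
  T Old;

public:
  ScopedValue(T &Var, T NewValue) : Ref(Var), Old(Var) { Var = NewValue; }
  ~ScopedValue() { Ref = Old; }
  ScopedValue(const ScopedValue &) = delete;
  ScopedValue &operator=(const ScopedValue &) = delete;
};

// Toy expression node; only its identity (address) matters here.
struct Call {
  std::string Name;
};

struct Emitter {
  const Call *MustTailCall = nullptr; // mirrors the new CodeGenFunction member
  std::vector<std::string> Log;

  // A call is a musttail call only if it is the exact node that was marked.
  void emitCall(const Call &C) {
    bool IsMustTail = (&C == MustTailCall);
    Log.push_back((IsMustTail ? "musttail call " : "call ") + C.Name);
  }

  // Models EmitAttributedStmt: mark the returned call while the statement
  // is being emitted, then restore the previous marker automatically.
  void emitAttributedReturn(const Call &Argument, const Call &Returned) {
    ScopedValue<const Call *> Save(MustTailCall, &Returned);
    emitCall(Argument); // nested call inside the statement, not the tail call
    emitCall(Returned); // the call in the return statement
  }
};

std::vector<std::string> runDemo() {
  Emitter E;
  Call Inner{"inner"}, Outer{"outer"};
  E.emitAttributedReturn(Inner, Outer);
  E.emitCall(Outer); // marker was restored to nullptr, so this is a plain call
  return E.Log;
}
```

Comparing node identity rather than passing a flag through every layer keeps nested calls in the same statement from being treated as the tail call.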

clang/lib/CodeGen/CodeGenFunction.h

Show First 20 Lines • Show All 514 Lines • ▼ Show 20 Lines	public:

/// True if CodeGen currently emits code inside presereved access index		/// True if CodeGen currently emits code inside presereved access index
/// region.		/// region.
bool IsInPreservedAIRegion = false;		bool IsInPreservedAIRegion = false;

/// True if the current statement has nomerge attribute.		/// True if the current statement has nomerge attribute.
bool InNoMergeAttributedStmt = false;		bool InNoMergeAttributedStmt = false;

		// The CallExpr within the current statement that the musttail attribute
		// applies to. nullptr if there is no 'musttail' on the current statement.
		const CallExpr *MustTailCall = nullptr;

/// True if the current function should be marked mustprogress.		/// True if the current function should be marked mustprogress.
bool FnIsMustProgress = false;		bool FnIsMustProgress = false;

/// True if the C++ Standard Requires Progress.		/// True if the C++ Standard Requires Progress.
bool CPlusPlusWithProgress() {		bool CPlusPlusWithProgress() {
if (CGM.getCodeGenOpts().getFiniteLoops() ==		if (CGM.getCodeGenOpts().getFiniteLoops() ==
CodeGenOptions::FiniteLoopsKind::Never)		CodeGenOptions::FiniteLoopsKind::Never)
return false;		return false;
Show All 32 Lines	public:

EHScopeStack EHStack;		EHScopeStack EHStack;
llvm::SmallVector<char, 256> LifetimeExtendedCleanupStack;		llvm::SmallVector<char, 256> LifetimeExtendedCleanupStack;
llvm::SmallVector<const JumpDest *, 2> SEHTryEpilogueStack;		llvm::SmallVector<const JumpDest *, 2> SEHTryEpilogueStack;

llvm::Instruction *CurrentFuncletPad = nullptr;		llvm::Instruction *CurrentFuncletPad = nullptr;

class CallLifetimeEnd final : public EHScopeStack::Cleanup {		class CallLifetimeEnd final : public EHScopeStack::Cleanup {
		bool isRedundantBeforeReturn() override { return true; }

llvm::Value *Addr;		llvm::Value *Addr;
llvm::Value *Size;		llvm::Value *Size;

public:		public:
CallLifetimeEnd(Address addr, llvm::Value *size)		CallLifetimeEnd(Address addr, llvm::Value *size)
: Addr(addr.getPointer()), Size(size) {}		: Addr(addr.getPointer()), Size(size) {}

void Emit(CodeGenFunction &CGF, Flags flags) override {		void Emit(CodeGenFunction &CGF, Flags flags) override {
▲ Show 20 Lines • Show All 3,328 Lines • ▼ Show 20 Lines	public:
// Scalar Expression Emission		// Scalar Expression Emission
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// EmitCall - Generate a call of the given function, expecting the given		/// EmitCall - Generate a call of the given function, expecting the given
/// result type, and using the given argument list which specifies both the		/// result type, and using the given argument list which specifies both the
/// LLVM arguments and the types they were derived from.		/// LLVM arguments and the types they were derived from.
RValue EmitCall(const CGFunctionInfo &CallInfo, const CGCallee &Callee,		RValue EmitCall(const CGFunctionInfo &CallInfo, const CGCallee &Callee,
ReturnValueSlot ReturnValue, const CallArgList &Args,		ReturnValueSlot ReturnValue, const CallArgList &Args,
llvm::CallBase **callOrInvoke, SourceLocation Loc);		llvm::CallBase **callOrInvoke, bool IsMustTail,
		SourceLocation Loc);
RValue EmitCall(const CGFunctionInfo &CallInfo, const CGCallee &Callee,		RValue EmitCall(const CGFunctionInfo &CallInfo, const CGCallee &Callee,
ReturnValueSlot ReturnValue, const CallArgList &Args,		ReturnValueSlot ReturnValue, const CallArgList &Args,
llvm::CallBase **callOrInvoke = nullptr) {		llvm::CallBase **callOrInvoke = nullptr,
		bool IsMustTail = false) {
return EmitCall(CallInfo, Callee, ReturnValue, Args, callOrInvoke,		return EmitCall(CallInfo, Callee, ReturnValue, Args, callOrInvoke,
SourceLocation());		IsMustTail, SourceLocation());
}		}
RValue EmitCall(QualType FnType, const CGCallee &Callee, const CallExpr *E,		RValue EmitCall(QualType FnType, const CGCallee &Callee, const CallExpr *E,
ReturnValueSlot ReturnValue, llvm::Value *Chain = nullptr);		ReturnValueSlot ReturnValue, llvm::Value *Chain = nullptr);
RValue EmitCallExpr(const CallExpr *E,		RValue EmitCallExpr(const CallExpr *E,
ReturnValueSlot ReturnValue = ReturnValueSlot());		ReturnValueSlot ReturnValue = ReturnValueSlot());
RValue EmitSimpleCallExpr(const CallExpr *E, ReturnValueSlot ReturnValue);		RValue EmitSimpleCallExpr(const CallExpr *E, ReturnValueSlot ReturnValue);
CGCallee EmitCallee(const Expr *E);		CGCallee EmitCallee(const Expr *E);

▲ Show 20 Lines • Show All 892 Lines • Show Last 20 Lines

clang/lib/CodeGen/EHScopeStack.h

Show First 20 Lines • Show All 144 Lines • ▼ Show 20 Lines	public:
protected:		protected:
~Cleanup() = default;		~Cleanup() = default;

public:		public:
Cleanup(const Cleanup &) = default;		Cleanup(const Cleanup &) = default;
Cleanup(Cleanup &&) {}		Cleanup(Cleanup &&) {}
Cleanup() = default;		Cleanup() = default;

		virtual bool isRedundantBeforeReturn() { return false; }

/// Generation flags.		/// Generation flags.
class Flags {		class Flags {
enum {		enum {
F_IsForEH = 0x1,		F_IsForEH = 0x1,
F_IsNormalCleanupKind = 0x2,		F_IsNormalCleanupKind = 0x2,
F_IsEHCleanupKind = 0x4,		F_IsEHCleanupKind = 0x4,
F_HasExitSwitch = 0x8,		F_HasExitSwitch = 0x8,
};		};
▲ Show 20 Lines • Show All 258 Lines • Show Last 20 Lines
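How CodeGen consumes `isRedundantBeforeReturn()` is not visible in this excerpt. One plausible use, sketched below under that assumption, is to scan the active cleanups before emitting a musttail call and to accept the call only if every cleanup reports itself as skippable. All type and function names in the sketch are hypothetical; only the hook name and its default of `false` come from the diff.

```cpp
#include <memory>
#include <vector>

// Toy counterpart of EHScopeStack::Cleanup with the new virtual hook.
struct Cleanup {
  virtual ~Cleanup() = default;
  // Conservative default: a cleanup must run before the function returns.
  virtual bool isRedundantBeforeReturn() { return false; }
};

// Like CallLifetimeEnd in the diff: a lifetime marker does no observable work.
struct LifetimeEndCleanup : Cleanup {
  bool isRedundantBeforeReturn() override { return true; }
};

// A cleanup that runs a destructor keeps the default answer.
struct DestructorCleanup : Cleanup {};

using CleanupStack = std::vector<std::unique_ptr<Cleanup>>;

// True if every active cleanup may be dropped ahead of a tail call.
bool canSkipAllCleanups(const CleanupStack &Stack) {
  for (const auto &C : Stack)
    if (!C->isRedundantBeforeReturn())
      return false;
  return true;
}

bool lifetimeOnlyStackIsSkippable() {
  CleanupStack Stack;
  Stack.emplace_back(new LifetimeEndCleanup());
  Stack.emplace_back(new LifetimeEndCleanup());
  return canSkipAllCleanups(Stack);
}

bool stackWithDestructorIsSkippable() {
  CleanupStack Stack;
  Stack.emplace_back(new LifetimeEndCleanup());
  Stack.emplace_back(new DestructorCleanup());
  return canSkipAllCleanups(Stack);
}
```

Defaulting the hook to `false` means any cleanup kind that has not been audited blocks the tail call, which is the safe direction for this check.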

clang/lib/Sema/JumpDiagnostics.cpp

//===--- JumpDiagnostics.cpp - Protected scope jump analysis ------*- C++ -*-=//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

// This file implements the JumpScopeChecker class, which is used to diagnose

// jumps that enter a protected scope in an invalid way.

//===----------------------------------------------------------------------===//

#include "clang/Sema/SemaInternal.h"

#include "clang/AST/DeclCXX.h"

#include "clang/AST/Expr.h"

#include "clang/AST/ExprCXX.h"

#include "clang/AST/StmtCXX.h"

#include "clang/AST/StmtObjC.h"

#include "clang/AST/StmtOpenMP.h"

#include "clang/Basic/SourceLocation.h"

#include "clang/Sema/SemaInternal.h"

#include "llvm/ADT/BitVector.h"

using namespace clang;

namespace {

/// JumpScopeChecker - This object is used by Sema to diagnose invalid jumps

/// into VLA and other protected scopes. For example, this rejects:

/// goto L;

/// int a[n];

/// L:

///

/// We also detect jumps out of protected scopes when it's not possible to do

/// cleanups properly. Indirect jumps and ASM jumps can't do cleanups because

/// the target is unknown. Return statements with \c [[clang::musttail]] cannot

rsmith:

/// cleanups properly. Indirect jumps and ASM jumps can't do cleanups because

- /// the target is unknown. Return statements with \c [musttail] cannot handle

+ /// the target is unknown. Return statements with \c [[clang::musttail]] cannot handle

/// any cleanups due to the nature of a tail call.

/// handle any cleanups due to the nature of a tail call.

class JumpScopeChecker {

Sema &S;

/// Permissive - True when recovering from errors, in which case precautions

/// are taken to handle incomplete scope information.

const bool Permissive;

/// GotoScope - This is a record that we use to keep track of all of the

Show All 23 Lines

class JumpScopeChecker {

};

SmallVector<GotoScope, 48> Scopes;

llvm::DenseMap<Stmt*, unsigned> LabelAndGotoScopes;

SmallVector<Stmt*, 16> Jumps;

SmallVector<Stmt*, 4> IndirectJumps;

SmallVector<Stmt*, 4> AsmJumps;

SmallVector<AttributedStmt *, 4> MustTailStmts;

SmallVector<LabelDecl*, 4> IndirectJumpTargets;

SmallVector<LabelDecl*, 4> AsmJumpTargets;

public:

JumpScopeChecker(Stmt *Body, Sema &S);

private:

void BuildScopeInformation(Decl *D, unsigned &ParentScope);

void BuildScopeInformation(VarDecl *D, const BlockDecl *BDecl,

unsigned &ParentScope);

void BuildScopeInformation(CompoundLiteralExpr *CLE, unsigned &ParentScope);

void BuildScopeInformation(Stmt *S, unsigned &origParentScope);

void VerifyJumps();

void VerifyIndirectOrAsmJumps(bool IsAsmGoto);

void VerifyMustTailStmts();

void NoteJumpIntoScopes(ArrayRef<unsigned> ToScopes);

void DiagnoseIndirectOrAsmJump(Stmt *IG, unsigned IGScope, LabelDecl *Target,

unsigned TargetScope);

void CheckJump(Stmt *From, Stmt *To, SourceLocation DiagLoc,

unsigned JumpDiag, unsigned JumpDiagWarning,

unsigned JumpDiagCXX98Compat);

void CheckGotoStmt(GotoStmt *GS);

const Attr *GetMustTailAttr(AttributedStmt *AS);

unsigned GetDeepestCommonScope(unsigned A, unsigned B);

};

} // end anonymous namespace

#define CHECK_PERMISSIVE(x) (assert(Permissive || !(x)), (Permissive && (x)))

JumpScopeChecker::JumpScopeChecker(Stmt *Body, Sema &s)

: S(s), Permissive(s.hasAnyUnrecoverableErrorsInThisFunction()) {

// Add a scope entry for function scope.

Scopes.push_back(GotoScope(~0U, ~0U, ~0U, SourceLocation()));

// Build information for the top level compound statement, so that we have a

// defined scope record for every "goto" and label.

unsigned BodyParentScope = 0;

BuildScopeInformation(Body, BodyParentScope);

// Check that all jumps we saw are kosher.

VerifyJumps();

VerifyIndirectOrAsmJumps(false);

VerifyIndirectOrAsmJumps(true);

VerifyMustTailStmts();

}

/// GetDeepestCommonScope - Finds the innermost scope enclosing the

/// two scopes.

unsigned JumpScopeChecker::GetDeepestCommonScope(unsigned A, unsigned B) {

while (A != B) {

// Inner scopes are created after outer scopes and therefore have

// higher indices.

▲ Show 20 Lines • Show All 455 Lines • ▼ Show 20 Lines

void JumpScopeChecker::BuildScopeInformation(Stmt *S,

}

case Stmt::CaseStmtClass:

case Stmt::DefaultStmtClass:

case Stmt::LabelStmtClass:

LabelAndGotoScopes[S] = ParentScope;

break;

case Stmt::AttributedStmtClass: {

AttributedStmt *AS = cast<AttributedStmt>(S);

if (GetMustTailAttr(AS)) {

LabelAndGotoScopes[AS] = ParentScope;

MustTailStmts.push_back(AS);

}

break;

}

default:

if (auto *ED = dyn_cast<OMPExecutableDirective>(S)) {

if (!ED->isStandaloneDirective()) {

unsigned NewParentScope = Scopes.size();

Scopes.emplace_back(ParentScope,

diag::note_omp_protected_structured_block,

diag::note_omp_exits_structured_block,

ED->getStructuredBlock()->getBeginLoc());

▲ Show 20 Lines • Show All 375 Lines • ▼ Show 20 Lines

void JumpScopeChecker::CheckGotoStmt(GotoStmt *GS) {

if (GS->getLabel()->isMSAsmLabel()) {

S.Diag(GS->getGotoLoc(), diag::err_goto_ms_asm_label)

<< GS->getLabel()->getIdentifier();

S.Diag(GS->getLabel()->getLocation(), diag::note_goto_ms_asm_label)

<< GS->getLabel()->getIdentifier();

}

void JumpScopeChecker::VerifyMustTailStmts() {

for (AttributedStmt *AS : MustTailStmts) {

for (unsigned I = LabelAndGotoScopes[AS]; I; I = Scopes[I].ParentScope) {

if (Scopes[I].OutDiag) {

S.Diag(AS->getBeginLoc(), diag::err_musttail_scope);

S.Diag(Scopes[I].Loc, Scopes[I].OutDiag);

}

const Attr *JumpScopeChecker::GetMustTailAttr(AttributedStmt *AS) {

ArrayRef<const Attr *> Attrs = AS->getAttrs();

const auto *Iter =

llvm::find_if(Attrs, [](const Attr *A) { return isa<MustTailAttr>(A); });

return Iter != Attrs.end() ? *Iter : nullptr;

}

void Sema::DiagnoseInvalidJumps(Stmt *Body) {

(void)JumpScopeChecker(Body, *this);

aaron.ballman:

const Attr *JumpScopeChecker::GetMustTailAttr(AttributedStmt *AS) {

- for (const auto *A : AS->getAttrs()) {

- if (A->getKind() == attr::MustTail) {

- return A;

- }

- return nullptr;

+ ArrayRef<const Attr *> Attrs = AS->getAttrs();

+ auto Iter = llvm::find_if(Attrs, [](const Attr *A) { return isa<MustTailAttr>(A); });

+ return Iter != Attrs.end() ? *Iter : nullptr;

}

void Sema::DiagnoseInvalidJumps(Stmt *Body) {

}
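The parent-chain walk in `VerifyMustTailStmts()` can be sketched on its own. This is a simplified model, not the checker itself: the real `GotoScope` stores diagnostic IDs as `unsigned`, while the hypothetical `Scope` below uses a string, and `firstBlockingScope` and `exampleScopes` are names invented for the example. As in the diff, index 0 is the function scope and ends the walk.

```cpp
#include <string>
#include <vector>

// Simplified scope record: a parent index plus the note to show when
// control leaves the scope. An empty note means leaving needs no cleanup.
struct Scope {
  unsigned ParentScope;
  std::string OutDiag;
};

// Walks from the scope holding the musttail statement up to the function
// scope and returns the first scope that would need a cleanup on exit,
// or 0 if the statement is not inside any such scope.
unsigned firstBlockingScope(const std::vector<Scope> &Scopes, unsigned Start) {
  for (unsigned I = Start; I; I = Scopes[I].ParentScope)
    if (!Scopes[I].OutDiag.empty())
      return I;
  return 0;
}

std::vector<Scope> exampleScopes() {
  return {
      {0, ""},                                    // 0: function scope
      {0, "exits scope of variable with non-trivial destructor"}, // 1
      {1, ""},                                    // 2: plain block inside 1
      {0, ""},                                    // 3: plain block at top level
  };
}
```

A musttail return located in scope 2 is rejected because its parent, scope 1, needs a cleanup; one located in scope 3 is accepted.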

clang/lib/Sema/Sema.cpp

Show First 20 Lines • Show All 2,073 Lines • ▼ Show 20 Lines	if (!FunctionScopes.empty())
FunctionScopes.back()->setHasBranchProtectedScope();		FunctionScopes.back()->setHasBranchProtectedScope();
}		}

void Sema::setFunctionHasIndirectGoto() {		void Sema::setFunctionHasIndirectGoto() {
if (!FunctionScopes.empty())		if (!FunctionScopes.empty())
FunctionScopes.back()->setHasIndirectGoto();		FunctionScopes.back()->setHasIndirectGoto();
}		}

		void Sema::setFunctionHasMustTail() {
		if (!FunctionScopes.empty())
		FunctionScopes.back()->setHasMustTail();
		}

BlockScopeInfo *Sema::getCurBlock() {		BlockScopeInfo *Sema::getCurBlock() {
if (FunctionScopes.empty())		if (FunctionScopes.empty())
return nullptr;		return nullptr;

auto CurBSI = dyn_cast<BlockScopeInfo>(FunctionScopes.back());		auto CurBSI = dyn_cast<BlockScopeInfo>(FunctionScopes.back());
if (CurBSI && CurBSI->TheDecl &&		if (CurBSI && CurBSI->TheDecl &&
!CurBSI->TheDecl->Encloses(CurContext)) {		!CurBSI->TheDecl->Encloses(CurContext)) {
// We have switched contexts due to template instantiation.		// We have switched contexts due to template instantiation.
▲ Show 20 Lines • Show All 490 Lines • Show Last 20 Lines

clang/lib/Sema/SemaStmt.cpp

//===--- SemaStmt.cpp - Semantic Analysis for Statements ------------------===// //===--- SemaStmt.cpp - Semantic Analysis for Statements ------------------===//

// //

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information. // See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// //

// This file implements semantic analysis for statements. // This file implements semantic analysis for statements.

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#include "clang/Sema/Ownership.h"

#include "clang/Sema/SemaInternal.h"

#include "clang/AST/ASTContext.h" #include "clang/AST/ASTContext.h"

#include "clang/AST/ASTDiagnostic.h" #include "clang/AST/ASTDiagnostic.h"

#include "clang/AST/ASTLambda.h" #include "clang/AST/ASTLambda.h"

#include "clang/AST/CharUnits.h"

#include "clang/AST/CXXInheritance.h" #include "clang/AST/CXXInheritance.h"

#include "clang/AST/CharUnits.h"

#include "clang/AST/DeclObjC.h" #include "clang/AST/DeclObjC.h"

#include "clang/AST/EvaluatedExprVisitor.h" #include "clang/AST/EvaluatedExprVisitor.h"

#include "clang/AST/ExprCXX.h" #include "clang/AST/ExprCXX.h"

#include "clang/AST/ExprObjC.h" #include "clang/AST/ExprObjC.h"

#include "clang/AST/IgnoreExpr.h"

#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/RecursiveASTVisitor.h"

#include "clang/AST/StmtCXX.h" #include "clang/AST/StmtCXX.h"

#include "clang/AST/StmtObjC.h" #include "clang/AST/StmtObjC.h"

#include "clang/AST/TypeLoc.h" #include "clang/AST/TypeLoc.h"

#include "clang/AST/TypeOrdering.h" #include "clang/AST/TypeOrdering.h"

#include "clang/Basic/TargetInfo.h" #include "clang/Basic/TargetInfo.h"

#include "clang/Lex/Preprocessor.h" #include "clang/Lex/Preprocessor.h"

#include "clang/Sema/Initialization.h" #include "clang/Sema/Initialization.h"

#include "clang/Sema/Lookup.h" #include "clang/Sema/Lookup.h"

#include "clang/Sema/Ownership.h"

#include "clang/Sema/Scope.h" #include "clang/Sema/Scope.h"

#include "clang/Sema/ScopeInfo.h" #include "clang/Sema/ScopeInfo.h"

#include "clang/Sema/SemaInternal.h"

#include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/ArrayRef.h"

#include "llvm/ADT/DenseMap.h" #include "llvm/ADT/DenseMap.h"

#include "llvm/ADT/STLExtras.h" #include "llvm/ADT/STLExtras.h"

#include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallPtrSet.h"

#include "llvm/ADT/SmallString.h" #include "llvm/ADT/SmallString.h"

#include "llvm/ADT/SmallVector.h" #include "llvm/ADT/SmallVector.h"

using namespace clang; using namespace clang;

▲ Show 20 Lines • Show All 510 Lines • ▼ Show 20 Lines if (!TheDecl->isGnuLocal()) {

} }

return LS; return LS;

} }

StmtResult Sema::BuildAttributedStmt(SourceLocation AttrsLoc, StmtResult Sema::BuildAttributedStmt(SourceLocation AttrsLoc,

ArrayRef<const Attr *> Attrs, ArrayRef<const Attr *> Attrs,

Stmt *SubStmt) { Stmt *SubStmt) {

// FIXME: this code should move when a planned refactoring around

// statement attributes lands.

for (const auto *A : Attrs) {

if (A->getKind() == attr::MustTail) {

if (!checkAndRewriteMustTailAttr(SubStmt, *A)) {

return SubStmt;

}

setFunctionHasMustTail();

aaron.ballman:

This functionality belongs in SemaStmtAttr.cpp, I think.

haberman (Author):

That is where I had originally put it, but that didn't work for templates. The semantic checks can only be performed at instantiation time. ActOnAttributedStmt seems to be the right hook point where I can evaluate the semantic checks for both template and non-template functions (with template functions getting checked at instantiation time).

aaron.ballman:

I disagree that ActOnAttributedStmt() is the correct place for this checking -- template checking should occur when the template is instantiated, same as happens for declaration attributes. I'd like to see this functionality moved to SemaStmtAttr.cpp. Keeping the attribute logic together and following the same patterns is what allows us to tablegenerate more of the attribute logic. Statement attributes are just starting to get more such automation.

haberman (Author):

I tried commenting out this code and adding the following code into handleMustTailAttr() in SemaStmtAttr.cpp:

if (!S.checkMustTailAttr(St, MTA))
  return nullptr;

This caused my test cases related to templates to fail. It also seemed to break test cases related to JumpDiagnostics. My interpretation of this is that handleMustTailAttr() is called during parsing only, and cannot catch errors at template instantiation time or that require a more complete AST.

What am I missing? Where in SemaStmtAttr.cpp are you suggesting that I put this check?

haberman (Author):

Scratch the part about JumpDiagnostics, that was me failing to call S.setFunctionHasMustTail(). I added that and now the JumpDiagnostics tests pass.

But the template test cases still fail, and I can't find any hook point in SemaStmtAttr.cpp that will let me evaluate these checks at template instantiation time.

aaron.ballman:

I think there's a bit of an architectural mixup, but I'm curious if @rsmith agrees before anyone starts doing work to make changes.

When transforming declarations, RebuildWhatever() calls the ActOnWhatever() function which calls ProcessDeclAttributeList() so that attributes are processed. RebuildAttributedStmt() similarly calls ActOnAttributedStmt(). However, ActOnAttributedStmt() doesn't call ProcessStmtAttributes() -- the logic is reversed so that ProcessStmtAttributes() is what calls ActOnAttributedStmt().

I think the correct answer is to switch the logic so that ActOnAttributedStmt() calls ProcessStmtAttributes(), then the template logic should automatically work.

haberman (Author):

> I think the correct answer is to switch the logic so that ActOnAttributedStmt() calls ProcessStmtAttributes()

I think this would require ProcessStmtAttributes() to be split into two separate functions. Currently that function is doing two separate things:

1. Translation of ParsedAttr into various subclasses of Attr.
2. Validation that the attribute is semantically valid.

The function signature for ActOnAttributedStmt() uses Attr (not ParsedAttr), so (1) must happen during the parse, before ActOnAttributedStmt() is called. But (2) must be deferred until template instantiation time for some cases, like musttail.
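The two responsibilities described above can be pictured as separate functions. The following is a toy sketch under stated assumptions, not Clang's API: `ParsedAttr`, `Attr`, and `Stmt` here are reduced to a few hypothetical fields, and `translate`, `validate`, and the `Check` result are invented names. It shows translation running once at parse time while validation is attempted then and repeated once a dependent statement has been instantiated.

```cpp
#include <string>

// Hypothetical, heavily reduced stand-ins for the Clang types named above.
struct ParsedAttr {
  std::string Spelling;
};
enum class AttrKind { MustTail, Unknown };
struct Attr {
  AttrKind Kind;
};
struct Stmt {
  bool IsReturnOfCall; // statement has the form `return call(...);`
  bool IsDependent;    // still depends on template parameters
};

// Step (1): translate the parsed form into a semantic attribute.
// This needs no knowledge of the statement and can run during the parse.
Attr translate(const ParsedAttr &P) {
  return Attr{P.Spelling == "clang::musttail" ? AttrKind::MustTail
                                              : AttrKind::Unknown};
}

enum class Check { Ok, Deferred, Error };

// Step (2): validate the attribute against its statement. For a dependent
// statement the answer is not known yet, so the check is repeated after
// instantiation with the transformed statement.
Check validate(const Attr &A, const Stmt &S) {
  if (A.Kind != AttrKind::MustTail)
    return Check::Ok;
  if (S.IsDependent)
    return Check::Deferred;
  return S.IsReturnOfCall ? Check::Ok : Check::Error;
}
```

In this model a template body yields `Deferred` at parse time, and the same `validate` call on the instantiated statement produces the final `Ok` or `Error`.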

aaron.ballman:

I don't think the signature for ActOnAttributedStmt() is correct to use Attr instead of ParsedAttr. I think it should be StmtResult ActOnAttributedStmt(const ParsedAttributesViewWithRange &AttrList, Stmt *SubStmt); -- this likely requires a fair bit of surgery to make work though, which is why I'd like to hear from @rsmith if he agrees with the approach. In the meantime, I'll play around with this idea locally in more depth.

aaron.ballman:

I think my suggestion wasn't quite right, but close. I've got a patch in progress that changes this the way I was thinking it should be changed, but it won't call ActOnAttributedStmt() when doing template instantiation. Instead, it will continue to instantiate attributes explicitly by calling TransformAttr() and any additional instantiation time checks will require you to add a TreeTransform::TransformWhateverAttr() to do the actual instantiation work (which is similar to how the declaration attributes work in Sema::InstantiateAttrs()).

I hope to put up a patch for review for these changes today or tomorrow. It'd be interesting to know whether they make your life easier or harder though, if you don't mind taking a look and seeing how well (or poorly) they integrate with your changes here.

rsmith:

I think the ideal model would be that we form a FooAttr from the user-supplied attribute description in an ActOn* function from the parser, and have a separate template instantiation mechanism to instantiate FooAttr objects, and those methods are unaware of the subject of the attribute. Then we have a separate mechanism to attach an attribute to its subjects that is used by both parsing and template instantiation. But I suspect there are reasons that doesn't work in practice -- where we need to know something about the subject in order to know how to form the FooAttr. That being the case, it probably makes most sense to model the formation and application of a FooAttr as a single process.

> it won't call ActOnAttributedStmt() when doing template instantiation

Good -- not calling ActOn* during template instantiation is the right choice in general -- the ActOn* functions are only supposed to be called from parsing, with a Build* added if the parsing and template instantiation paths would share code (we sometimes shortcut that when the ActOn* and Build* would be identical, but I think that's turned out to be a mistake).

> any additional instantiation time checks will require you to add a TreeTransform::TransformWhateverAttr() to do the actual instantiation work

That sounds appropriate to me in general. Are you expecting that this function would also be given the (transformed and perhaps original) subject of the attribute?

aaron.ballman

You can find that review at https://reviews.llvm.org/D99896.

haberman (Author)

Would it be possible to defer that refactoring until after this change is in? There are a lot of other issues to resolve on this review as it is, and throwing a potential refactoring into the mix is making it a lot harder to get this into a state where it can be landed.

Once it's in I'm happy to collaborate on the other review.

aaron.ballman

I'm fine with that -- my suggestion would be to ignore the template instantiation validation for the moment (add tests with FIXME comments where the behavior isn't what you want) and then when I get you the functionality you need to have more unified checking, you can refactor it at that time.

haberman (Author)

I would strongly prefer to submit correct code (that validates templates) and leave a FIXME to make it pretty, rather than submit pretty code and leave a FIXME to make it correct.

aaron.ballman

I'm okay with that so long as the follow-up work actually happens (not to suggest that you plan to ignore the request!). "This is functional but not pretty" has a risk of becoming enshrined behavior as priorities shift, whereas "this is incomplete" generally does not.

Please add a FIXME comment here just to make sure it's clear we want the code to move in the future.

haberman (Author)

I added a FIXME. Just to set expectations, I'm happy to work with you on updating this code to fit your planned refactoring (either by offering comments/suggestions on a review by you or creating my own follow-up review per your suggestions). But I'll need a fair amount of input from you, since I don't fully grok what you find objectionable about the current code or what your desired end state is.

aaron.ballman

Thanks for the FIXME. I'm totally happy to iterate with you on the refactoring. Mostly, it involves testing whether https://reviews.llvm.org/D99983 provides you with enough contextual information when performing template instantiation for you to be able to put the attribute checking logic into the right places.

The objectionable bit about the current approach is that ActOnAttributedStmt()/BuildAttributedStmt() are general functions for attributed statements that should not be doing per-attribute diagnostic work (this won't scale well as more statement attributes get added). My preferred approach based on what you have already is to call checkMustTailAttr() from handleMustTailAttr(), and call it from TreeTransform.h in a new TransformMustTailAttr() function when doing template instantiation (this part is what requires the other patch to land first).

haberman (Author)

Sounds good. I will follow up with you on https://reviews.llvm.org/D99983.


}

return AttributedStmt::Create(Context, AttrsLoc, Attrs, SubStmt);

}

StmtResult Sema::ActOnAttributedStmt(const ParsedAttributesWithRange &Attrs,

Stmt *SubStmt) {

SmallVector<const Attr *, 1> SemanticAttrs;

ProcessStmtAttributes(SubStmt, Attrs, SemanticAttrs);

if (!SemanticAttrs.empty())

return BuildAttributedStmt(Attrs.Range.getBegin(), SemanticAttrs, SubStmt);

// If none of the attributes applied, that's fine, we can recover by

// returning the substatement directly instead of making an AttributedStmt

// with no attributes on it.

return SubStmt;

}

bool Sema::checkAndRewriteMustTailAttr(Stmt *St, const Attr &MTA) {

ReturnStmt *R = cast<ReturnStmt>(St);

Expr *E = R->getRetValue();

if (CurContext->isDependentContext() || (E && E->isInstantiationDependent()))

// We have to suspend our check until template instantiation time.

return true;

rsmith

It's a bit awkward, but I think we should delay this check until after the others -- complaining about non-trivial destruction seems beside the point if the returned value isn't a function call.

Also, the diagnostic text for this error seems narrower than the cases it covers. For example:

void f(const char*);
void g(const char *s) {
  [[clang::musttail]] return f((s + "foo"s).c_str());
}

would be diagnosed as "attribute requires that the return type and all arguments are trivially destructible", and they are; the problem is that the return value creates a temporary object with non-trivial destruction.
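To make the distinction concrete, here is a small self-contained sketch that is not taken from the patch. `Temp`, `events`, `f` and `g` are illustrative names, and `Temp` merely stands in for `std::string`. It records the order of events to show that a temporary created inside the return expression is destroyed only after the callee has returned, which is why the call cannot be the caller's last action even though the argument type itself is trivially destructible.

```cpp
#include <string>
#include <utility>
#include <vector>

// Illustrative global log of the order in which things happen.
std::vector<std::string> events;

// Stand-in for std::string: a class type with a non-trivial destructor.
struct Temp {
  std::string text;
  explicit Temp(std::string t) : text(std::move(t)) {}
  ~Temp() { events.push_back("~Temp"); }
  const char *c_str() const { return text.c_str(); }
};

void f(const char *s) { events.push_back(std::string("f(") + s + ")"); }

void g(const char *s) {
  // The argument type (const char *) is trivially destructible, but the Temp
  // temporary lives until the end of the full expression, that is, until
  // after f() has returned. That pending destructor call is what rules out a
  // guaranteed tail call, so no musttail attribute is used here.
  return f(Temp(std::string(s) + "foo").c_str());
}
```

Calling `g("x")` and then inspecting `events` shows the call to `f` logged before the destructor of the temporary.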

aaron.ballman

This check should not be necessary once the code moves to SemaStmtAttr.cpp and Attr.td gets a correct subjects line.


if (!checkMustTailAttr(St, MTA))

return false;

// FIXME: Replace Expr::IgnoreImplicitAsWritten() with this function.

rsmith

I would have thought this assert would fire for void f() { [[clang::musttail]] return; }. If so, we should reject this case with a diagnostic.


// Currently it does not skip implicit constructors in an initialization

// context.

rsmith

IgnoreUnlessSpelledInSource is a syntactic check that's only really intended for tooling use cases; I think we want something a bit more semantic here, so IgnoreImplicitAsWritten would be more appropriate.

I think it would be reasonable to also skip "parentheses" here (which we treat as also including things like C's _Generic). Would Ex->IgnoreImplicitAsWritten()->IgnoreParens() work?

If we're going to skip elidable copy construction of the result here (which I think we should), should we also reflect that in the AST? Perhaps we should strip the return value down to being just the call expression? I'm thinking in particular of things like building in C++14 or before with -fno-elide-constructors, where code generation for a by-value return of a class object will synthesize a local temporary to hold the result, with a final destination copy emitted after the call. (Testcase: struct A { A(const A&); }; A f(); A g() { [[clang::musttail]] return f(); } with -fno-elide-constructors.)

haberman (Author)

IgnoreImplicitAsWritten() doesn't skip ExprWithCleanups, and per your previous comment I was trying to find a CallExpr before doing the check prohibiting ExprWithCleanups with side effects.

I could write some custom ignore logic using clang::IgnoreExprNodes() directly.

> If we're going to skip elidable copy construction of the result here (which I think we should)

To clarify, are you suggesting that we allow musttail through elidable copy constructors on the return value, even if -fno-elide-constructors is set? I.e., do we consider that musttail overrides the -fno-elide-constructors option on the command line?

rsmith

> IgnoreImplicitAsWritten() doesn't skip ExprWithCleanups

That sounds like a bug. Are you sure? It looks like IgnoreImplicitAsWrittenSingleStep calls IgnoreImplicitSingleStep which calls IgnoreImplicitCastsSingleStep which skips FullExpr, and ExprWithCleanups is a kind of FullExpr.

> To clarify, are you suggesting that we allow musttail through elidable copy constructors on the return value, even if -fno-elide-constructors is set? ie. we consider that musttail overrides the -fno-elide-constructors option on the command line?

Yes, I think the musttail attribute should override -fno-elide-constructors, because that's necessary in order to provide the tail call the user requested (and the local setting should override the global one). This is probably worth adding to the documentation.

(Also, -fno-elide-constructors is only supposed to affect code generation, not language semantics or program validity, so I think either we should always reject if a constructor call is required for the return value, regardless of whether it's elidable, or we should never reject in that case, and either way this determination should be made independent of the setting of -fno-elide-constructors. Given that choice, it seems more useful to bias towards the common case (-felide-constructors).)
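As a rough illustration of the case being discussed, the sketch below expands the one-line testcase from earlier in the thread into something that can be compiled and run. The `MUSTTAIL` macro is a hypothetical helper, not part of the patch: it expands to `[[clang::musttail]]` only when Clang reports support for the attribute, and to nothing otherwise. The member `value` and the function bodies are assumptions added so the example links.

```cpp
// Hypothetical helper macro: use the attribute only where Clang provides it.
#if defined(__clang__) && defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::musttail)
#define MUSTTAIL [[clang::musttail]]
#endif
#endif
#ifndef MUSTTAIL
#define MUSTTAIL
#endif

struct A {
  int value;
  explicit A(int v) : value(v) {}
  A(const A &other) : value(other.value) {} // user-provided copy constructor
};

A f() { return A(3); }

A g() {
  // Any copy from f()'s result into g()'s result is elidable. The position
  // taken in the thread is that the attribute should see through that copy
  // regardless of how -fno-elide-constructors is set.
  MUSTTAIL return f();
}
```

Under the behavior proposed above, `g()` is accepted whether or not `-fno-elide-constructors` is passed, and `g().value` is 3 either way.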


auto IgnoreImplicitAsWritten = [](Expr *E) -> Expr * {

return IgnoreExprNodes(E, IgnoreImplicitAsWrittenSingleStep,

IgnoreElidableImplicitConstructorSingleStep);

aaron.ballman

return false;

}

- const Expr *Ex = R->getRetValue();

+ const Expr *Ex = R->getRetValue()->IgnoreParenImpCasts();

- // We don't actually support tail calling through an implicit cast (we require

- // the return types to match), but getting the actual function call will let

- // us give a better error message about the return type mismatch.

- if (const ImplicitCastExpr *ICE = dyn_cast<ImplicitCastExpr>(Ex)) {

- Ex = ICE->getSubExpr();

- }

const CallExpr *CE = dyn_cast<CallExpr>(Ex);

I think you want to ignore parens and implicit casts here. e.g., there's no reason to diagnose code like:

int foo();
int bar() {
  [[clang::musttail]] return (bar());
}

rsmith

ReturnStmt *R = cast<ReturnStmt>(St);

- R->setRetValue(IgnoreImplicitAsWritten(R->getRetValue()));

- Expr *Ex = R->getRetValue();

+ Expr *Ex = IgnoreImplicitAsWritten(R->getRetValue());

+ R->setRetValue(Ex);

while (!isa<CallExpr>(Ex)) {

I think this would be clearer, assuming it's equivalent (and if it's not equivalent, I think it'd be useful to include a comment explaining why).


};

// Now that we have verified that 'musttail' is valid here, rewrite the

// return value to remove all implicit nodes, but retain parentheses.

R->setRetValue(IgnoreImplicitAsWritten(E));

aaron.ballman

if (!CE) {

- Diag(St->getBeginLoc(), diag::err_musttail_needs_call) << MTA.getSpelling();

+ Diag(St->getBeginLoc(), diag::err_musttail_needs_call) << MTA;

return false;

rsmith

A call expression doesn't necessarily have a known callee declaration. I would expect this assert to fire on a case like:

void f() {
  void (*p)() = f;
  [[clang::musttail]] return p();
}

We should reject this with a diagnostic.

haberman (Author)

I think this case will work actually, the callee decl in this case is just the function pointer, which seems appropriate and type checks correctly.

I added a test for this.
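For readers who want to try the function-pointer case, here is a terminating variant of the example above. It is a sketch, not the test that was added to the patch. `countdown`, `step` and the `MUSTTAIL` macro are illustrative names, and the macro falls back to a plain `return` on compilers that do not report support for the attribute.

```cpp
// Hypothetical helper macro: use the attribute only where Clang provides it.
#if defined(__clang__) && defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::musttail)
#define MUSTTAIL [[clang::musttail]]
#endif
#endif
#ifndef MUSTTAIL
#define MUSTTAIL
#endif

long countdown(long n, long acc);

long step(long n, long acc) {
  // The callee is reached through a local function pointer, so the "callee
  // declaration" is the pointer variable rather than a function.
  long (*next)(long, long) = countdown;
  MUSTTAIL return next(n - 1, acc + n);
}

long countdown(long n, long acc) {
  if (n == 0)
    return acc;
  MUSTTAIL return step(n, acc);
}
```

`countdown(10, 0)` adds 10 + 9 + ... + 1 and returns 55. Both functions share the signature `long(long, long)`, which keeps the caller and callee types aligned.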

rsmith

This loop is problematic: it's generally not safe to modify an expression that is used as a subexpression of another expression. (Modifying the ReturnStmt is, by contrast, much less problematic because the properties of a statement have less complex dependencies on the properties of its subexpressions.) In particular, if there were any implicit conversions here that changed the type or value category or similar, the enclosing parentheses would have the wrong type / value category / similar. Also there are possibilities here other than CallExpr and ParenExpr, such as anything else that we consider to be "parentheses" (such as a GenericSelectionExpr).

But I think this loop should never be necessary, because all implicit conversions should always be on the outside of the parentheses. Do you have a testcase that needs it?

haberman (Author)

I removed it and my test cases still pass. I'm glad to know this isn't necessary: I was coding defensively because I didn't know that I could count on this invariant:

> all implicit conversions should always be on the outside of the parentheses.


return true;

}

bool Sema::checkMustTailAttr(const Stmt *St, const Attr &MTA) {

assert(!CurContext->isDependentContext() &&

"musttail cannot be checked from a dependent context");

rsmith

You shouldn't need the const in the argument to cast, and we generally omit it; cast copies the pointer/referenceness and qualifiers from its argument anyway, and the explicit const in the type of R seems sufficient for readers. (I'm not even sure if cast intends to permit explicit qualifiers here.)


// FIXME: Add Expr::IgnoreParenImplicitAsWritten() with this definition.

auto IgnoreParenImplicitAsWritten = [](const Expr *E) -> const Expr * {

rsmith

return true;

- const ReturnStmt *R = cast<const ReturnStmt>(St);

+ const auto *R = cast<ReturnStmt>(St);

const Expr *Ex = R->getRetValue();

... would be more in line with our normal idioms.


return IgnoreExprNodes(const_cast<Expr *>(E), IgnoreParensSingleStep,

IgnoreImplicitAsWrittenSingleStep,

aaron.ballman

Diag(St->getBeginLoc(), diag::err_musttail_only_from_function)

- << MTA.getSpelling();

+ << MTA;

return false;


IgnoreElidableImplicitConstructorSingleStep);

};

const Expr *E = cast<ReturnStmt>(St)->getRetValue();

const auto *CE = dyn_cast_or_null<CallExpr>(IgnoreParenImplicitAsWritten(E));

aaron.ballman

return false;

- } else if (CallerDecl->isDependentContext()) {

+ }

+ if (CallerDecl->isDependentContext())

// We have to suspend our check until template instantiation time.

return true;

- }

// Detect member function calls, inspired by Expr::findBoundMemberType().


if (!CE) {

Diag(St->getBeginLoc(), diag::err_musttail_needs_call) << &MTA;

aaron.ballman

IgnoreElidableImplicitConstructorSingleStep);

};

- const CallExpr *CE =

+ const auto *CE =

dyn_cast_or_null<CallExpr>(IgnoreParenImplicitAsWritten(Ex));


return false;

}

aaron.ballman

// type of "this".

- if (const MemberExpr *mem = dyn_cast<MemberExpr>(Callee)) {

+ if (const MemberExpr *Mem = dyn_cast<MemberExpr>(Callee)) {

// Call is: obj.method() or obj->method()


if (const auto *EWC = dyn_cast<ExprWithCleanups>(E)) {

if (EWC->cleanupsHaveSideEffects()) {

aaron.ballman

Is this assertion valid? Consider:

struct S {
  static int foo();
};

int bar() {
  S s;
  [[clang::musttail]] return s.foo();
}

haberman (Author)

I have a test for that in CodeGen/attr-musttail.cpp (see Func4()). It appears that this goes through FunctionToPointerDecay and ends up getting handled by the function case below.
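A runnable version of the static-member case raised above might look like the following sketch. The body of `S::foo` and the `MUSTTAIL` macro are assumptions added for illustration, and the macro falls back to a plain `return` where the attribute is unavailable.

```cpp
// Hypothetical helper macro: use the attribute only where Clang provides it.
#if defined(__clang__) && defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::musttail)
#define MUSTTAIL [[clang::musttail]]
#endif
#endif
#ifndef MUSTTAIL
#define MUSTTAIL
#endif

struct S {
  static int foo() { return 7; } // body added so the example links
};

int bar() {
  S s;
  // Member-access syntax, but the callee is a static member function, so
  // there is no "this" argument and the call behaves like an ordinary
  // function call for the purposes of the signature comparison.
  MUSTTAIL return s.foo();
}
```

Because `S` is trivially destructible, the local `s` does not leave any cleanup pending after the call.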

rsmith

Please pass in a flag here so the diagnostic can %select and produce a more specific description of the problem.


Diag(St->getBeginLoc(), diag::err_musttail_needs_trivial_args) << &MTA;

return false;

}

aaron.ballman

This worries me slightly -- not all CallExpr objects have a callee declaration (https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/Expr.cpp#L1367). That said, I'm struggling to come up with an example that isn't covered so this may be fine.

haberman (Author)

That was my experience too, I wasn't able to find a case that isn't covered. I tried to avoid adding any diagnostics that I didn't know how to trigger or test.

rsmith

This assert is incorrect. It would fail for a case like:

using T = int();
T *f();
int g() { [[clang::musttail]] return f()(); }

... where there is no declaration associated with the function pointer returned by f().

I think instead of looking for a callee declaration, you should instead inspect the callee expression. You can distinguish between a member function call and a non-member call by looking at the type of the callee. Perhaps the simplest way would be to distinguish between three cases:

(1) There is a callee declaration, which is a member function: this is a direct call to a member function; you can use the type of the callee declaration for your check.
(2) The callee expression is (after skipping parens) a pointer-to-member access operator (BinaryOperator::isPtrMemOp); you can use the type of the RHS operand (which will be a pointer to member function) for your check.
(3) Anything else: this is a non-member-function call, and you can directly inspect the type of the callee without caring about the callee declaration. (You might still find the type is not a function type at this stage, which indicates this is some kind of special form. In particular, it could be a BuiltinType::BoundMember for a pseudo-destructor call. I'm not sure if there are currently any other special cases that make it this far; there might not be, because most such cases are dependent.)
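The three cases can be pictured at the source level with the sketch below, which shows one call of each shape. It illustrates the call forms only and is not the Sema classification code. `Counter`, `twice`, `pick`, `indirect` and the `MUSTTAIL` macro are hypothetical names, and the macro falls back to a plain `return` where the attribute is unavailable.

```cpp
// Hypothetical helper macro: use the attribute only where Clang provides it.
#if defined(__clang__) && defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::musttail)
#define MUSTTAIL [[clang::musttail]]
#endif
#endif
#ifndef MUSTTAIL
#define MUSTTAIL
#endif

struct Counter {
  int value;
  int get(int bias) { return value + bias; }
  int direct(int bias);
  int throughPointer(int bias);
};

// Case (1): the callee declaration is a member function.
int Counter::direct(int bias) { MUSTTAIL return get(bias); }

// Case (2): the callee expression is a pointer-to-member access operator;
// the type to inspect is that of the right-hand operand, pmf.
int Counter::throughPointer(int bias) {
  int (Counter::*pmf)(int) = &Counter::get;
  MUSTTAIL return (this->*pmf)(bias);
}

// Case (3): anything else. Here the callee is the value returned by pick(),
// so there is no callee declaration at all, only a callee type.
using T = int(int);
int twice(int x) { return 2 * x; }
T *pick() { return twice; }
int indirect(int x) { MUSTTAIL return pick()(x); }
```

In each case the caller and callee have the same parameter and return types, and in the two member cases the same class type for `this`.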


// We need to determine the full function type (including "this" type, if any)

// for both caller and callee.

struct FuncType {

aaron.ballman

return false;

}

- if (const ExprWithCleanups *EWC = dyn_cast<ExprWithCleanups>(Ex)) {

+ if (const auto *EWC = dyn_cast<ExprWithCleanups>(Ex)) {

if (EWC->cleanupsHaveSideEffects()) {


enum {

ft_non_member,

ft_static_member,

rsmith

There are a couple of other contexts that can include a return statement: the caller could also be an ObjCMethodDecl (an Objective-C method) or a CapturedDecl (the body of a #pragma omp parallel region). I'd probably use a specific diagnostic ("cannot be used from a block" / "cannot be used from an Objective-C function") for the block and ObjCMethod case, and a nonspecific-but-correct "cannot be used from this context" for anything else.

rjmccall

Blocks ought to be extremely straightforward to support. Just validate that the tail call is to a block pointer and then check that the underlying function types line up in the same way. You will need to be able to verify that there isn't a non-trivial conversion on the return types, even if the return type isn't known at this point in the function, but that's a problem in C++ as well due to lambdas and auto deduced return types.

Also, you can use isa<...> for checks like this instead of dyn_cast<...>.

haberman (Author)

Tail calls to a block are indeed straightforward and are handled below. This check is for tail calls from a block, which I tried to add support for but didn't have much luck (in particular, during parsing of a block I wasn't able to get good type information for the block).

> I'd probably use a specific diagnostic ("cannot be used from a block" / "cannot be used from an Objective-C function") for the block and ObjCMethod case, and a nonspecific-but-correct "cannot be used from this context" for anything else.

I implemented this as requested. I wasn't able to test OpenMP as you apparently can't return from an OpenMP block.


ft_non_static_member,

ft_pointer_to_member,

} MemberType = ft_non_member;

QualType This;

const FunctionProtoType *Func;

const CXXMethodDecl *Method = nullptr;

} CallerType, CalleeType;

auto GetMethodType = [this, St, MTA](const CXXMethodDecl *CMD, FuncType &Type,

aaron.ballman

I think this could assert on K&R C declarations because those don't have a prototype. e.g.,

int f(); // Important: compile this as C code, not C++ code

int bar(void) {
  [[clang::musttail]] return f();
}


bool IsCallee) -> bool {

aaron.ballman

} CallerType, CalleeType;

- const FunctionDecl *CallerDecl = dyn_cast<FunctionDecl>(CurContext);

+ const auto *CallerDecl = dyn_cast<FunctionDecl>(CurContext);

auto GetMethodType = [this, St, MTA](const CXXMethodDecl *CMD, FuncType &Type,


if (isa<CXXConstructorDecl, CXXDestructorDecl>(CMD)) {

Diag(St->getBeginLoc(), diag::err_musttail_structors_forbidden)

<< IsCallee << isa<CXXDestructorDecl>(CMD);

if (IsCallee)

aaron.ballman

bool IsCallee) -> bool {

- if (isa<CXXConstructorDecl>(CMD) || isa<CXXDestructorDecl>(CMD)) {

+ if (isa<CXXConstructorDecl, CXXDestructorDecl>(CMD)) {

Diag(St->getBeginLoc(), diag::err_musttail_structors_forbidden) << &MTA;


Diag(CMD->getBeginLoc(), diag::note_musttail_structors_forbidden)

<< isa<CXXDestructorDecl>(CMD);

Diag(MTA.getLocation(), diag::note_tail_call_required) << &MTA;

return false;

}

rsmith

I think this isa<CapturedDecl> check is redundant, because a CapturedDecl is not a FunctionDecl, so CallerDecl will always be null when CurContext is a CapturedDecl.


if (CMD->isStatic())

Type.MemberType = FuncType::ft_static_member;

else {

Type.This = CMD->getThisType()->getPointeeType();

Type.MemberType = FuncType::ft_non_static_member;

aaron.ballman

: QualType();

};

- auto TypesMatch = [this](QualType a, QualType b) -> bool {

- if (a == QualType() || b == QualType()) {

- return a == b;

- } else {

- return Context.hasSimilarType(a, b);

- }

+ auto DoTypesMatch = [this](QualType A, QualType B) {

+ if (A.isNull() || B.isNull())

+ return A == B;

+ return Context.hasSimilarType(A, B);

};

bool types_match =


}

Type.Func = CMD->getType()->castAs<FunctionProtoType>();

aaron.ballman

}

};

- bool types_match =

+ bool TypesMatch =

TypesMatch(CallerDecl->getReturnType(), CalleeType->getReturnType()) &&


return true;

};

const auto *CallerDecl = dyn_cast<FunctionDecl>(CurContext);

// Find caller function signature.

aaron.ballman

if (types_match) {

- ArrayRef<QualType> callee_params = CalleeType->getParamTypes();

- ArrayRef<ParmVarDecl *> caller_params = CallerDecl->parameters();

+ ArrayRef<QualType> CalleeParams = CalleeType->getParamTypes();

+ ArrayRef<ParmVarDecl *> CallerParams = CallerDecl->parameters();

size_t n = CallerDecl->param_size();


if (!CallerDecl) {

int ContextType;

aaron.ballman

ArrayRef<ParmVarDecl *> caller_params = CallerDecl->parameters();

- size_t n = CallerDecl->param_size();

- for (size_t i = 0; i < n; i++) {

+ size_t N = CallerDecl->param_size();

+ for (size_t I = 0; I < N; ++I) {

if (!TypesMatch(callee_params[i], caller_params[i]->getType())) {

How do you want to handle variadic function calls?

haberman (Author)

I added a check to disallow variadic function calls.
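For contrast with the rejected variadic case, the sketch below shows a call shape that the parameter-matching loop quoted above is meant to accept: caller and callee have identical, non-variadic signatures. `sum_to`, `log_and_sum` and the `MUSTTAIL` macro are illustrative names rather than code from the patch, and the macro falls back to a plain `return` where the attribute is unavailable.

```cpp
// Hypothetical helper macro: use the attribute only where Clang provides it.
#if defined(__clang__) && defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::musttail)
#define MUSTTAIL [[clang::musttail]]
#endif
#endif
#ifndef MUSTTAIL
#define MUSTTAIL
#endif

long long sum_to(long long n, long long acc) {
  if (n == 0)
    return acc;
  // Caller and callee are both long long(long long, long long): the return
  // types match, each parameter type matches, and neither is variadic.
  MUSTTAIL return sum_to(n - 1, acc + n);
}

// A variadic callee, for example
//   long long log_and_sum(const char *fmt, ...);
// has no fixed parameter list to compare against; per the reply above, such
// calls are diagnosed instead of being matched.
```

`sum_to(1000, 0)` returns 500500.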


if (isa<BlockDecl>(CurContext))

ContextType = 0;

else if (isa<ObjCMethodDecl>(CurContext))

aaron.ballman

return false;

- } else if (const CXXMethodDecl *CMD = dyn_cast<CXXMethodDecl>(CurContext)) {

+ } else if (const auto *CMD = dyn_cast<CXXMethodDecl>(CurContext)) {

// Caller is a class/struct method.


ContextType = 1;

else

ContextType = 2;

Diag(St->getBeginLoc(), diag::err_musttail_forbidden_from_this_context)

<< &MTA << ContextType;

rsmith

// Caller is a non-method function.

- CallerType.Func = dyn_cast<FunctionProtoType>(CallerDecl->getType());

+ CallerType.Func = CallerDecl->getType()->getAs<FunctionProtoType>();

}

const Decl *CalleeDecl = CE->getCalleeDecl();

Use getAs rather than dyn_cast to look through type sugar. For example, in

void (f)() { [[clang::musttail]] return f(); }

... the type of f is a ParenType, not a FunctionProtoType.
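The point about sugar can be checked from the language side with a short sketch: a function declared through an alias and one declared with a parenthesized declarator both have the plain type `int()`, even though Clang's AST spells their types with different sugar nodes. `Fn`, `viaAlias`, `viaParens` and the `MUSTTAIL` macro are illustrative names, and the macro falls back to a plain `return` where the attribute is unavailable.

```cpp
#include <type_traits>

// Hypothetical helper macro: use the attribute only where Clang provides it.
#if defined(__clang__) && defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::musttail)
#define MUSTTAIL [[clang::musttail]]
#endif
#endif
#ifndef MUSTTAIL
#define MUSTTAIL
#endif

using Fn = int();
Fn viaAlias;       // declared through a type alias
int(viaParens)();  // declared with a parenthesized declarator

// Semantically both are just int(); only the spelling differs.
static_assert(std::is_same<decltype(viaAlias), int()>::value,
              "alias does not change the function type");
static_assert(std::is_same<decltype(viaParens), int()>::value,
              "parentheses do not change the function type");

int viaAlias() { return 1; }

// The caller's type is written with parentheses, as in the example above.
int(viaParens)() { MUSTTAIL return viaAlias(); }
```

A check that only accepts an exact `FunctionProtoType` node would miss both spellings, which is the motivation given for `getAs`.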


return false;

} else if (const auto *CMD = dyn_cast<CXXMethodDecl>(CurContext)) {

aaron.ballman

Diag(St->getBeginLoc(), diag::err_musttail_return_type_mismatch)

- << MTA.getSpelling();

+ << MTA;

return false;


// Caller is a class/struct method.

if (!GetMethodType(CMD, CallerType, false))

return false;

} else {

// Caller is a non-method function.

CallerType.Func = CallerDecl->getType()->getAs<FunctionProtoType>();

}

rsmith

You need to use getAs<MemberPointerType> here not isa in order to look through type sugar (eg, typedefs).

However, as noted above, a call via a member pointer doesn't necessarily have a CalleeDecl, so you'll need to do this check by looking for a callee expression that's the right kind of BinaryOperator instead.


const Expr *CalleeExpr = CE->getCallee()->IgnoreParens();

const auto *CalleeBinOp = dyn_cast<BinaryOperator>(CalleeExpr);

aaron.ballman

return false;

- } else if (VD && dyn_cast<MemberPointerType>(VD->getType())) {

+ } else if (const auto *MPT = dyn_cast_or_null<MemberPointerType>(VD ? VD->getType() : nullptr)) {

// Call is: obj->*method_ptr or obj.*method_ptr

- const auto *MPT = VD->getType()->castAs<MemberPointerType>();

CalleeType.This = QualType(MPT->getClass(), 0);

I'm not certain if I should take a shower after writing that code or not, but it's one potential way not to perform the cast twice.

If that code is too odious for others, we should at least change the dyn_cast<> in the else if to be an isa<>.

haberman (Author)

I changed dyn_cast<> to isa<>. If @rsmith concurs about the dyn_cast_or_null<> variant I'll switch to that.


SourceLocation CalleeLoc = CE->getCalleeDecl()

aaron.ballman

// Call is: obj->*method_ptr or obj.*method_ptr

- const MemberPointerType *MPT = VD->getType()->castAs<MemberPointerType>();

+ const auto *MPT = VD->getType()->castAs<MemberPointerType>();

CalleeType.This = QualType(MPT->getClass(), 0);

It'd be better not to go through the cast machinery twice -- you cast to the MemberPointerType and then cast to the same thing again (but in a different way).

haberman (Author)

I changed to auto, but I can't tell if you have another suggestion here also. I can't see how any of these casts can be removed.


? CE->getCalleeDecl()->getBeginLoc()

: St->getBeginLoc();

// Find callee function signature.

if (const CXXMethodDecl *CMD =

dyn_cast_or_null<CXXMethodDecl>(CE->getCalleeDecl())) {

// Call is: obj.method(), obj->method(), functor(), etc.

if (!GetMethodType(CMD, CalleeType, true))

return false;

} else if (CalleeBinOp && CalleeBinOp->isPtrMemOp()) {

// Call is: obj->*method_ptr or obj.*method_ptr

rsmith

Even in invalid code we should never see a CallExpr whose callee has a null type; if Sema can't form an Expr that meets the normal expression invariants during error recovery, it doesn't build one at all. I think you can remove this if.

haberman (Author)

Without this if(), I crash on this test case. What do you think?

struct TestBadPMF {
  int (TestBadPMF::*pmf)();
  void BadPMF() {
    [[clang::musttail]] return ((*this)->*pmf)(); // expected-error {{left hand operand to ->* must be a pointer to class compatible with the right hand operand, but is 'TestBadPMF'}}
  }
};

Dump of CalleeExpr is:

RecoveryExpr 0x106671e8 '<dependent type>' contains-errors lvalue
|-ParenExpr 0x10667020 'struct TestBadPMF' lvalue
| `-UnaryOperator 0x10667008 'struct TestBadPMF' lvalue prefix '*' cannot overflow
|   `-CXXThisExpr 0x10666ff8 'struct TestBadPMF *' this
`-MemberExpr 0x10667050 'int (struct TestBadPMF::*)(void)' lvalue ->pmf 0x10666ed0
  `-CXXThisExpr 0x10667040 'struct TestBadPMF *' implicit this

rsmith

Ah, right, while the callee will always have a non-null type, that type might not be a pointer type.

I think what we're missing here is a check for a dependent callee; checking for a dependent context isn't enough to check for error-dependent constructs. Probably the simplest thing would be to change the isDependentContext() checks to also check if the return expression isInstantiationDependent(). (That would only help with the error-dependent cases for now, but we'd also need that extra check in the future if anything like http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2277r0.html goes forward, allowing dependent constructs in non-dependent contexts, especially in combination with http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1306r1.pdf.)

rsmith: Ah, right, while the callee will always have a non-null type, that type might not be a pointer…

const auto *MPT =

CalleeBinOp->getRHS()->getType()->castAs<MemberPointerType>();

CalleeType.This = QualType(MPT->getClass(), 0);

CalleeType.Func = MPT->getPointeeType()->castAs<FunctionProtoType>();

CalleeType.MemberType = FuncType::ft_pointer_to_member;

} else if (isa<CXXPseudoDestructorExpr>(CalleeExpr)) {

Diag(St->getBeginLoc(), diag::err_musttail_structors_forbidden)

<< /* IsCallee = */ 1 << /* IsDestructor = */ 1;

Diag(MTA.getLocation(), diag::note_tail_call_required) << &MTA;

return false;

} else {

// Non-method function.

CalleeType.Func =

CalleeExpr->getType()->getPointeeType()->getAs<FunctionProtoType>();

}

// Both caller and callee must have a prototype (no K&R declarations).

if (!CalleeType.Func || !CallerType.Func) {

Diag(St->getBeginLoc(), diag::err_musttail_needs_prototype) << &MTA;

if (!CalleeType.Func && CE->getDirectCallee()) {

Diag(CE->getDirectCallee()->getBeginLoc(),

diag::note_musttail_fix_non_prototype);

}

if (!CallerType.Func)

Diag(CallerDecl->getBeginLoc(), diag::note_musttail_fix_non_prototype);

return false;

}

// Caller and callee must have matching calling conventions.

// Some calling conventions are physically capable of supporting tail calls

// even if the function types don't perfectly match. LLVM is currently too

// strict to allow this, but if LLVM added support for this in the future, we

// could exit early here and skip the remaining checks if the functions are

// using such a calling convention.

if (CallerType.Func->getCallConv() != CalleeType.Func->getCallConv()) {

if (const auto *ND = dyn_cast_or_null<NamedDecl>(CE->getCalleeDecl()))

Diag(St->getBeginLoc(), diag::err_musttail_callconv_mismatch)

<< true << ND->getDeclName();

else

Diag(St->getBeginLoc(), diag::err_musttail_callconv_mismatch) << false;

Diag(CalleeLoc, diag::note_musttail_callconv_mismatch)

<< FunctionType::getNameForCallConv(CallerType.Func->getCallConv())

<< FunctionType::getNameForCallConv(CalleeType.Func->getCallConv());

Diag(MTA.getLocation(), diag::note_tail_call_required) << &MTA;

return false;

}

if (CalleeType.Func->isVariadic() || CallerType.Func->isVariadic()) {

Diag(St->getBeginLoc(), diag::err_musttail_no_variadic) << &MTA;

return false;

}

// Caller and callee must match in whether they have a "this" parameter.

if (CallerType.This.isNull() != CalleeType.This.isNull()) {

if (const auto *ND = dyn_cast_or_null<NamedDecl>(CE->getCalleeDecl())) {

Diag(St->getBeginLoc(), diag::err_musttail_member_mismatch)

<< CallerType.MemberType << CalleeType.MemberType << true

<< ND->getDeclName();

Diag(CalleeLoc, diag::note_musttail_callee_defined_here)

rsmith:

if (!Context.hasSimilarType(A, B)) {

- PD << Select << A << B;

+ PD << Select << A.getUnqualifiedType() << B.getUnqualifiedType();

return false;

Given that we don't care about differences in qualifiers, it might be clearer to not include them in the diagnostics.
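
As a rough illustration of the point about qualifiers: the signature check treats types that differ only in cv-qualification as matching, so printing the qualifiers in the note adds nothing. The sketch below is a standalone analogy in standard C++, not Clang's `hasSimilarType`; `StripQuals` and `SimilarIgnoringQuals` are hypothetical helpers, and only pointer levels are handled (references and member pointers are left out).

```cpp
#include <type_traits>

// Hypothetical helper (not a Clang API): drop cv-qualifiers at every
// pointer level, so that 'const int *const' becomes 'int *'.
template <class T, bool IsPtr = std::is_pointer<T>::value>
struct StripQuals {
  using type = typename std::remove_cv<T>::type;
};

// Pointer case: strip the pointee recursively, then rebuild an
// unqualified pointer to the stripped pointee.
template <class T>
struct StripQuals<T, true> {
  using type =
      typename StripQuals<typename std::remove_pointer<T>::type>::type *;
};

// True when A and B are the same type once qualifiers are ignored.
template <class A, class B>
constexpr bool SimilarIgnoringQuals() {
  return std::is_same<typename StripQuals<A>::type,
                      typename StripQuals<B>::type>::value;
}

static_assert(SimilarIgnoringQuals<const int *, int *>(),
              "qualifier-only difference is treated as a match");
static_assert(!SimilarIgnoringQuals<long, int>(),
              "different underlying types still mismatch");
```

Under this analogy, a caller taking `int *` and a callee taking `const int *` would pass the check, which is the situation the `Struct3` case in the CodeGen test exercises.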

<< ND->getDeclName();

} else

Diag(St->getBeginLoc(), diag::err_musttail_member_mismatch)

<< CallerType.MemberType << CalleeType.MemberType << false;

Diag(MTA.getLocation(), diag::note_tail_call_required) << &MTA;

return false;

}

auto CheckTypesMatch = [this](FuncType CallerType, FuncType CalleeType,

PartialDiagnostic &PD) -> bool {

enum {

ft_different_class,

ft_parameter_arity,

ft_parameter_mismatch,

ft_return_type,

};

auto DoTypesMatch = [this, &PD](QualType A, QualType B,

unsigned Select) -> bool {

if (!Context.hasSimilarType(A, B)) {

PD << Select << A.getUnqualifiedType() << B.getUnqualifiedType();

return false;

}

return true;

};

if (!CallerType.This.isNull() &&

!DoTypesMatch(CallerType.This, CalleeType.This, ft_different_class))

return false;

if (!DoTypesMatch(CallerType.Func->getReturnType(),

CalleeType.Func->getReturnType(), ft_return_type))

return false;

if (CallerType.Func->getNumParams() != CalleeType.Func->getNumParams()) {

PD << ft_parameter_arity << CallerType.Func->getNumParams()

<< CalleeType.Func->getNumParams();

return false;

}

ArrayRef<QualType> CalleeParams = CalleeType.Func->getParamTypes();

ArrayRef<QualType> CallerParams = CallerType.Func->getParamTypes();

size_t N = CallerType.Func->getNumParams();

for (size_t I = 0; I < N; I++) {

if (!DoTypesMatch(CalleeParams[I], CallerParams[I],

ft_parameter_mismatch)) {

PD << static_cast<int>(I) + 1;

return false;

}

}

return true;

};

PartialDiagnostic PD = PDiag(diag::note_musttail_mismatch);

if (!CheckTypesMatch(CallerType, CalleeType, PD)) {

if (const auto *ND = dyn_cast_or_null<NamedDecl>(CE->getCalleeDecl()))

Diag(St->getBeginLoc(), diag::err_musttail_mismatch)

<< true << ND->getDeclName();

else

Diag(St->getBeginLoc(), diag::err_musttail_mismatch) << false;

Diag(CalleeLoc, PD);

Diag(MTA.getLocation(), diag::note_tail_call_required) << &MTA;

return false;

}

return true;

}

namespace {

class CommaVisitor : public EvaluatedExprVisitor<CommaVisitor> {

typedef EvaluatedExprVisitor<CommaVisitor> Inherited;

Sema &SemaRef;

public:

CommaVisitor(Sema &SemaRef) : Inherited(SemaRef.Context), SemaRef(SemaRef) {}

void VisitBinaryOperator(BinaryOperator *E) {

if (E->getOpcode() == BO_Comma)

[... 3,960 lines not shown ...]

clang/lib/Sema/SemaStmtAttr.cpp

//===--- SemaStmtAttr.cpp - Statement Attribute Handling ------------------===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

// This file implements stmt-related attribute processing.

//===----------------------------------------------------------------------===//

#include "clang/AST/ASTContext.h"

#include "clang/AST/EvaluatedExprVisitor.h"

#include "clang/Basic/SourceManager.h"

rsmith:

Looks like you're not using this. (And that's good: the parent map should never be used in the static compiler; it's a tooling-only facility.)

#include "clang/Basic/TargetInfo.h"

#include "clang/Sema/DelayedDiagnostic.h"

#include "clang/Sema/Lookup.h"

#include "clang/Sema/ScopeInfo.h"

#include "clang/Sema/SemaInternal.h"

#include "llvm/ADT/StringExtras.h"

using namespace clang;

[... 180 lines not shown ...]

if (!CEF.foundCallExpr()) {

S.Diag(St->getBeginLoc(), diag::warn_nomerge_attribute_ignored_in_stmt)

<< NMA.getSpelling();

return nullptr;

}

return ::new (S.Context) NoMergeAttr(S.Context, A);

}

static Attr *handleMustTailAttr(Sema &S, Stmt *St, const ParsedAttr &A,

SourceRange Range) {

// Validation is in Sema::ActOnAttributedStmt().

aaron.ballman:

This can be removed entirely.

return ::new (S.Context) MustTailAttr(S.Context, A);

}

aaron.ballman:

SourceRange Range) {

- MustTailAttr MTA(S.Context, A);

- if (S.CheckAttrNoArgs(A))

- return nullptr;

// Validation is in Sema::ActOnAttributedStmt().

None of this should be needed due to the automated diagnostic checking.

static Attr *handleLikely(Sema &S, Stmt *St, const ParsedAttr &A,

SourceRange Range) {

if (!S.getLangOpts().CPlusPlus20 && A.isCXX11Attribute() && !A.getScopeName())

S.Diag(A.getLoc(), diag::ext_cxx20_attr) << A << Range;

return ::new (S.Context) LikelyAttr(S.Context, A);

}

[... 187 lines not shown ...]

static Attr *ProcessStmtAttribute(Sema &S, Stmt *St, const ParsedAttr &A,

case ParsedAttr::AT_LoopHint:

return handleLoopHintAttr(S, St, A, Range);

case ParsedAttr::AT_OpenCLUnrollHint:

return handleOpenCLUnrollHint(S, St, A, Range);

case ParsedAttr::AT_Suppress:

return handleSuppressAttr(S, St, A, Range);

case ParsedAttr::AT_NoMerge:

return handleNoMergeAttr(S, St, A, Range);

case ParsedAttr::AT_MustTail:

return handleMustTailAttr(S, St, A, Range);

case ParsedAttr::AT_Likely:

return handleLikely(S, St, A, Range);

case ParsedAttr::AT_Unlikely:

return handleUnlikely(S, St, A, Range);

default:

// N.B., ClangAttrEmitter.cpp emits a diagnostic helper that ensures a

// declaration attribute is not written on a statement, but this code is

// needed for attributes in Attr.td that do not list any subjects.

[... 16 lines not shown ...]

clang/test/CodeGenCXX/attr-musttail.cpp

This file was added.

// RUN: %clang_cc1 -fno-elide-constructors -S -emit-llvm %s -triple x86_64-unknown-linux-gnu -o - | FileCheck %s

// RUN: %clang_cc1 -fno-elide-constructors -S -emit-llvm %s -triple x86_64-unknown-linux-gnu -o - | opt -verify

// FIXME: remove the call to "opt" once the tests are running the Clang verifier automatically again.

int Bar(int);

int Baz(int);

int Func1(int x) {

if (x) {

// CHECK: %call = musttail call i32 @_Z3Bari(i32 %1)

// CHECK-NEXT: ret i32 %call

[[clang::musttail]] return Bar(x);

} else {

[[clang::musttail]] return Baz(x); // CHECK: %call1 = musttail call i32 @_Z3Bazi(i32 %3)

}

}

int Func2(int x) {

{

[[clang::musttail]] return Bar(Bar(x));

}

}

// CHECK: %call1 = musttail call i32 @_Z3Bari(i32 %call)

class Foo {

public:

static int StaticMethod(int x);

int MemberFunction(int x);

int TailFrom(int x);

int TailFrom2(int x);

int TailFrom3(int x);

};

int Foo::TailFrom(int x) {

[[clang::musttail]] return MemberFunction(x);

}

// CHECK: %call = musttail call i32 @_ZN3Foo14MemberFunctionEi(%class.Foo* nonnull dereferenceable(1) %this1, i32 %0)

int Func3(int x) {

[[clang::musttail]] return Foo::StaticMethod(x);

}

// CHECK: %call = musttail call i32 @_ZN3Foo12StaticMethodEi(i32 %0)

int Func4(int x) {

Foo foo; // Object with trivial destructor.

[[clang::musttail]] return foo.StaticMethod(x);

}

// CHECK: %call = musttail call i32 @_ZN3Foo12StaticMethodEi(i32 %0)

int (Foo::*pmf)(int);

int Foo::TailFrom2(int x) {

[[clang::musttail]] return ((*this).*pmf)(x);

}

// CHECK: %call = musttail call i32 %8(%class.Foo* nonnull dereferenceable(1) %this.adjusted, i32 %9)

int Foo::TailFrom3(int x) {

[[clang::musttail]] return (this->*pmf)(x);

}

// CHECK: %call = musttail call i32 %8(%class.Foo* nonnull dereferenceable(1) %this.adjusted, i32 %9)

void ReturnsVoid();

void Func5() {

[[clang::musttail]] return ReturnsVoid();

}

// CHECK: musttail call void @_Z11ReturnsVoidv()

class HasTrivialDestructor {};

int ReturnsInt(int x);

int Func6(int x) {

HasTrivialDestructor foo;

[[clang::musttail]] return ReturnsInt(x);

}

// CHECK: %call = musttail call i32 @_Z10ReturnsInti(i32 %0)

struct Data {

int (*fptr)(Data *);

};

int Func7(Data *data) {

[[clang::musttail]] return data->fptr(data);

}

// CHECK: %call = musttail call i32 %1(%struct.Data* %2)

template <class T>

T TemplateFunc(T) {

return 5;

}

int Func9(int x) {

[[clang::musttail]] return TemplateFunc<int>(x);

}

// CHECK: %call = musttail call i32 @_Z12TemplateFuncIiET_S0_(i32 %0)

template <class T>

int Func10(int x) {

T t;

[[clang::musttail]] return Bar(x);

}

int Func11(int x) {

return Func10<int>(x);

}

// CHECK: %call = musttail call i32 @_Z3Bari(i32 %0)

template <class T>

T Func12(T x) {

[[clang::musttail]] return ::Bar(x);

}

int Func13(int x) {

return Func12<int>(x);

}

// CHECK: %call = musttail call i32 @_Z3Bari(i32 %0)

int Func14(int x) {

int vla[x];

[[clang::musttail]] return Bar(x);

}

// CHECK: %call = musttail call i32 @_Z3Bari(i32 %3)

void TrivialDestructorParam(HasTrivialDestructor obj);

void Func14(HasTrivialDestructor obj) {

[[clang::musttail]] return TrivialDestructorParam(obj);

}

// CHECK: musttail call void @_Z22TrivialDestructorParam20HasTrivialDestructor()

struct Struct3 {

void ConstMemberFunction(const int *) const;

void NonConstMemberFunction(int *i);

};

void Struct3::NonConstMemberFunction(int *i) {

// The parameters are not identical, but they are compatible.

[[clang::musttail]] return ConstMemberFunction(i);

}

// CHECK: musttail call void @_ZNK7Struct319ConstMemberFunctionEPKi(%struct.Struct3* nonnull dereferenceable(1) %this1, i32* %0)

struct HasNonTrivialCopyConstructor {

HasNonTrivialCopyConstructor(const HasNonTrivialCopyConstructor &);

};

HasNonTrivialCopyConstructor ReturnsClassByValue();

HasNonTrivialCopyConstructor TestNonElidableCopyConstructor() {

[[clang::musttail]] return (((ReturnsClassByValue())));

}

// CHECK: musttail call void @_Z19ReturnsClassByValuev(%struct.HasNonTrivialCopyConstructor* sret(%struct.HasNonTrivialCopyConstructor) align 1 %agg.result)

struct HasNonTrivialCopyConstructor2 {

// Copy constructor works even if it has extra default params.

HasNonTrivialCopyConstructor2(const HasNonTrivialCopyConstructor &, int DefaultParam = 5);

};

HasNonTrivialCopyConstructor2 ReturnsClassByValue2();

HasNonTrivialCopyConstructor2 TestNonElidableCopyConstructor2() {

[[clang::musttail]] return (((ReturnsClassByValue2())));

}

// CHECK: musttail call void @_Z20ReturnsClassByValue2v()

void TestFunctionPointer(int x) {

void (*p)(int) = nullptr;

[[clang::musttail]] return p(x);

}

// CHECK: musttail call void %0(i32 %1)

struct LargeWithCopyConstructor {

LargeWithCopyConstructor(const LargeWithCopyConstructor &);

char data[32];

};

LargeWithCopyConstructor ReturnsLarge();

LargeWithCopyConstructor TestLargeWithCopyConstructor() {

[[clang::musttail]] return ReturnsLarge();

}

// CHECK: define dso_local void @_Z28TestLargeWithCopyConstructorv(%struct.LargeWithCopyConstructor* noalias sret(%struct.LargeWithCopyConstructor) align 1 %agg.result)

// CHECK: musttail call void @_Z12ReturnsLargev(%struct.LargeWithCopyConstructor* sret(%struct.LargeWithCopyConstructor) align 1 %agg.result)

using IntFunctionType = int();

IntFunctionType *ReturnsIntFunction();

int TestRValueFunctionPointer() {

[[clang::musttail]] return ReturnsIntFunction()(); // expected-error{{'musttail' attribute requires that caller and callee have compatible function signatures}} // expected-note{{target function has different return type ('long' expected but has 'int')}}

}

// CHECK: musttail call i32 %call()

void(FuncWithParens)() {

[[clang::musttail]] return FuncWithParens();

}

// CHECK: musttail call i32 %call()

int TestNonCapturingLambda() {

auto lambda = []() { return 12; }; // expected-note {{target function is a member function of class 'const (lambda}}

[[clang::musttail]] return (+lambda)();

rsmith:

int TestNonCapturingLambda() {

- auto lambda = []() { return 12; }; // expected-note {{target function is a member function of class 'const (lambda at /usr/local/google/home/haberman/code/llvm-project/clang/test/SemaCXX/attr-musttail.cpp:142:17)'}}

+ auto lambda = []() { return 12; }; // expected-note {{target function is a member function of class 'const (lambda}}

[[clang::musttail]] return (+lambda)();

}

clang/test/Sema/attr-musttail.c

This file was added.

				// RUN: %clang_cc1 -verify -fsyntax-only %s

				int NotAProtoType(); // expected-note{{add 'void' to the parameter list to turn an old-style K&R function declaration into a prototype}}
				int TestCalleeNotProtoType(void) {
				__attribute__((musttail)) return NotAProtoType(); // expected-error{{'musttail' attribute requires that both caller and callee functions have a prototype}}
				}

				int ProtoType(void);
				int TestCallerNotProtoType() { // expected-note{{add 'void' to the parameter list to turn an old-style K&R function declaration into a prototype}}
				__attribute__((musttail)) return ProtoType(); // expected-error{{'musttail' attribute requires that both caller and callee functions have a prototype}}
				}

				int TestProtoType(void) {
				return ProtoType();
				}

clang/test/Sema/attr-musttail.m

This file was added.

				// RUN: %clang_cc1 -fsyntax-only -fblocks -Wno-objc-root-class -verify %s

				void TestObjcBlock(void) {
				void (^x)(void) = ^(void) {
				__attribute__((musttail)) return TestObjcBlock(); // expected-error{{'musttail' attribute cannot be used from a block}}
				};
				__attribute__((musttail)) return x();
				}

				void ReturnsVoid(void);
				void TestObjcBlockVar(void) {
				__block int i = 0; // expected-note{{jump exits scope of __block variable}}
				__attribute__((musttail)) return ReturnsVoid(); // expected-error{{cannot perform a tail call from this return statement}}
				}

				__attribute__((objc_root_class))
				@interface TestObjcClass
				@end

				@implementation TestObjcClass

				- (void)testObjCMethod {
				__attribute__((musttail)) return ReturnsVoid(); // expected-error{{'musttail' attribute cannot be used from an Objective-C function}}
				}

				@end

clang/test/SemaCXX/attr-musttail.cpp

This file was added.

				// RUN: %clang_cc1 -verify -fsyntax-only -fms-extensions -fcxx-exceptions -fopenmp %s

				int ReturnsInt1();
				int Func1() {
				[[clang::musttail]] ReturnsInt1(); // expected-error {{'musttail' attribute only applies to return statements}}
				[[clang::musttail(1, 2)]] return ReturnsInt1(); // expected-error {{'musttail' attribute takes no arguments}}
				[[clang::musttail]] return 5; // expected-error {{'musttail' attribute requires that the return value is the result of a function call}}
				[[clang::musttail]] return ReturnsInt1();
				}

				void NoFunctionCall() {
				[[clang::musttail]] return; // expected-error {{'musttail' attribute requires that the return value is the result of a function call}}
				}

				[[clang::musttail]] static int int_val = ReturnsInt1(); // expected-error {{'musttail' attribute cannot be applied to a declaration}}

				void NoParams(); // expected-note {{target function has different number of parameters (expected 1 but has 0)}}
				void TestParamArityMismatch(int x) {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return NoParams(); // expected-error {{cannot perform a tail call to function 'NoParams' because its signature is incompatible with the calling function}}
				}

				void LongParam(long x); // expected-note {{target function has type mismatch at 1st parameter (expected 'long' but has 'int')}}
				void TestParamTypeMismatch(int x) {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return LongParam(x); // expected-error {{cannot perform a tail call to function 'LongParam' because its signature is incompatible with the calling function}}
				}

				long ReturnsLong(); // expected-note {{target function has different return type ('int' expected but has 'long')}}
				int TestReturnTypeMismatch() {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return ReturnsLong(); // expected-error {{cannot perform a tail call to function 'ReturnsLong' because its signature is incompatible with the calling function}}
				}

				struct Struct1 {
				void MemberFunction(); // expected-note {{'MemberFunction' declared here}}
				};
				void TestNonMemberToMember() {
				Struct1 st;
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return st.MemberFunction(); // expected-error {{non-member function cannot perform a tail call to non-static member function 'MemberFunction'}}
				}

				void ReturnsVoid(); // expected-note {{'ReturnsVoid' declared here}}
				struct Struct2 {
				void TestMemberToNonMember() {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return ReturnsVoid(); // expected-error{{non-static member function cannot perform a tail call to non-member function 'ReturnsVoid'}}
				}
				};

				class HasNonTrivialDestructor {
				public:
				~HasNonTrivialDestructor() {}
				int ReturnsInt();
				};

				void ReturnsVoid2();
				void TestNonTrivialDestructorInScope() {
				HasNonTrivialDestructor foo; // expected-note {{jump exits scope of variable with non-trivial destructor}}
				[[clang::musttail]] return ReturnsVoid(); // expected-error {{cannot perform a tail call from this return statement}}
				}

				int NonTrivialParam(HasNonTrivialDestructor x);
				int TestNonTrivialParam(HasNonTrivialDestructor x) {
				[[clang::musttail]] return NonTrivialParam(x); // expected-error {{tail call requires that the return value, all parameters, and any temporaries created by the expression are trivially destructible}}
				}

				HasNonTrivialDestructor ReturnsNonTrivialValue();
				HasNonTrivialDestructor TestReturnsNonTrivialValue() {
				// FIXME: the diagnostic cannot currently distinguish between needing to run a
				// destructor for the return value and needing to run a destructor for some
				// other temporary created in the return statement.
				[[clang::musttail]] return (ReturnsNonTrivialValue()); // expected-error {{tail call requires that the return value, all parameters, and any temporaries created by the expression are trivially destructible}}
				}

				HasNonTrivialDestructor TestReturnsNonTrivialNonFunctionCall() {
				[[clang::musttail]] return HasNonTrivialDestructor(); // expected-error {{'musttail' attribute requires that the return value is the result of a function call}}
				}

				struct UsesPointerToMember {
				void (UsesPointerToMember::*p_mem)(); // expected-note {{'p_mem' declared here}}
				};
				void TestUsesPointerToMember(UsesPointerToMember *foo) {
				// "this" pointer cannot double as first parameter.
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return (foo->*(foo->p_mem))(); // expected-error {{non-member function cannot perform a tail call to pointer-to-member function 'p_mem'}}
				}

				void ReturnsVoid2();
				void TestNestedClass() {
				HasNonTrivialDestructor foo;
				class Nested {
				__attribute__((noinline)) static void NestedMethod() {
				// Outer non-trivial destructor does not affect nested class.
				[[clang::musttail]] return ReturnsVoid2();
				}
				};
				}

				template <class T>
				T TemplateFunc(T x) { // expected-note{{target function has different return type ('long' expected but has 'int')}}
				return x ? 5 : 10;
				}
				int OkTemplateFunc(int x) {
				[[clang::musttail]] return TemplateFunc<int>(x);
				}
				template <class T>
				T BadTemplateFunc(T x) {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return TemplateFunc<int>(x); // expected-error {{cannot perform a tail call to function 'TemplateFunc' because its signature is incompatible with the calling function}}
				}
				long TestBadTemplateFunc(long x) {
				return BadTemplateFunc<long>(x); // expected-note {{in instantiation of}}
				}

				void IntParam(int x);
				void TestVLA(int x) {
				HasNonTrivialDestructor vla[x]; // expected-note {{jump exits scope of variable with non-trivial destructor}}
				[[clang::musttail]] return IntParam(x); // expected-error {{cannot perform a tail call from this return statement}}
				}

				void TestNonTrivialDestructorSubArg(int x) {
				[[clang::musttail]] return IntParam(NonTrivialParam(HasNonTrivialDestructor())); // expected-error {{tail call requires that the return value, all parameters, and any temporaries created by the expression are trivially destructible}}
				}

				void VariadicFunction(int x, ...);
				void TestVariadicFunction(int x, ...) {
				[[clang::musttail]] return VariadicFunction(x); // expected-error {{'musttail' attribute may not be used with variadic functions}}
				}

				int TakesIntParam(int x); // expected-note {{target function has type mismatch at 1st parameter (expected 'int' but has 'short')}}
				int TakesShortParam(short x); // expected-note {{target function has type mismatch at 1st parameter (expected 'short' but has 'int')}}
				int TestIntParamMismatch(int x) {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return TakesShortParam(x); // expected-error {{cannot perform a tail call to function 'TakesShortParam' because its signature is incompatible with the calling function}}
				}
				int TestIntParamMismatch2(short x) {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return TakesIntParam(x); // expected-error {{cannot perform a tail call to function 'TakesIntParam' because its signature is incompatible with the calling function}}
				}

				struct TestClassMismatch1 {
				void ToFunction(); // expected-note{{target function is a member of different class (expected 'TestClassMismatch2' but has 'TestClassMismatch1')}}
				};
				TestClassMismatch1 *tcm1;
				struct TestClassMismatch2 {
				void FromFunction() {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return tcm1->ToFunction(); // expected-error {{cannot perform a tail call to function 'ToFunction' because its signature is incompatible with the calling function}}
				}
				};

				__regcall int RegCallReturnsInt(); // expected-note {{target function has calling convention regcall (expected cdecl)}}
				int TestMismatchCallingConvention() {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return RegCallReturnsInt(); // expected-error {{cannot perform a tail call to function 'RegCallReturnsInt' because it uses an incompatible calling convention}}
				}

				int TestNonCapturingLambda() {
				auto lambda = []() { return 12; }; // expected-note {{'operator()' declared here}}
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return lambda(); // expected-error {{non-member function cannot perform a tail call to non-static member function 'operator()'}}

				// This works.
				auto lambda_fptr = static_cast<int (*)()>(lambda);
				[[clang::musttail]] return lambda_fptr();
				[[clang::musttail]] return (+lambda)();
				}

				int TestCapturingLambda() {
				int x;
				auto lambda = [x]() { return 12; }; // expected-note {{'operator()' declared here}}
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return lambda(); // expected-error {{non-member function cannot perform a tail call to non-static member function 'operator()'}}
				}

				int TestNonTrivialTemporary(int) {
				[[clang::musttail]] return TakesIntParam(HasNonTrivialDestructor().ReturnsInt()); // expected-error {{tail call requires that the return value, all parameters, and any temporaries created by the expression are trivially destructible}}
				}

				void ReturnsVoid();
				struct TestDestructor {
				~TestDestructor() {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return ReturnsVoid(); // expected-error {{destructor '~TestDestructor' must not return void expression}} // expected-error {{cannot perform a tail call from a destructor}}
				}
				};

				struct ClassWithDestructor { // expected-note {{target destructor is declared here}}
				void TestExplicitDestructorCall() {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return this->~ClassWithDestructor(); // expected-error {{cannot perform a tail call to a destructor}}
				}
				};

				struct HasNonTrivialCopyConstructor {
				HasNonTrivialCopyConstructor(const HasNonTrivialCopyConstructor &);
				};
				HasNonTrivialCopyConstructor ReturnsClassByValue();
				HasNonTrivialCopyConstructor TestNonElidableCopyConstructor() {
				// This is an elidable constructor, but when it is written explicitly
				// we decline to elide it.
				[[clang::musttail]] return HasNonTrivialCopyConstructor(ReturnsClassByValue()); // expected-error{{'musttail' attribute requires that the return value is the result of a function call}}
				}

				struct ClassWithConstructor {
				ClassWithConstructor() = default; // expected-note {{target constructor is declared here}}
				};
				void TestExplicitConstructorCall(ClassWithConstructor a) {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return a.ClassWithConstructor::ClassWithConstructor(); // expected-error{{cannot perform a tail call to a constructor}} expected-warning{{explicit constructor calls are a Microsoft extension}}
				}

				void TestStatementExpression() {
				({
				HasNonTrivialDestructor foo; // expected-note {{jump exits scope of variable with non-trivial destructor}}
				[[clang::musttail]] return ReturnsVoid2(); // expected-error {{cannot perform a tail call from this return statement}}
				});
				}

				struct MyException {};
				void TestTryBlock() {
				try { // expected-note {{jump exits try block}}
				[[clang::musttail]] return ReturnsVoid2(); // expected-error {{cannot perform a tail call from this return statement}}
				} catch (MyException &e) {
				}
				}

				using IntFunctionType = int();
				IntFunctionType *ReturnsIntFunction();
				long TestRValueFunctionPointer() {
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return ReturnsIntFunction()(); // expected-error{{cannot perform a tail call to function because its signature is incompatible with the calling function}} // expected-note{{target function has different return type ('long' expected but has 'int')}}
				}

				void TestPseudoDestructor() {
				int n;
				using T = int;
				[[clang::musttail]] // expected-note {{tail call required by 'musttail' attribute here}}
				return n.~T(); // expected-error{{cannot perform a tail call to a destructor}}
				}

				struct StructPMF {
				typedef void (StructPMF::*PMF)();
				static void TestReturnsPMF();
				};

				StructPMF *St;
				StructPMF::PMF ReturnsPMF();
				void StructPMF::TestReturnsPMF() {
				[[clang::musttail]] // expected-note{{tail call required by 'musttail' attribute here}}
				return (St->*ReturnsPMF())(); // expected-error{{static member function cannot perform a tail call to pointer-to-member function}}
				}

				// These tests are merely verifying that we don't crash with incomplete or
				// erroneous ASTs. These cases crashed the compiler in early iterations.

				struct TestBadPMF {
				int (TestBadPMF::*pmf)();
				void BadPMF() {
				[[clang::musttail]] return ((*this)->*pmf)(); // expected-error {{left hand operand to ->* must be a pointer to class compatible with the right hand operand, but is 'TestBadPMF'}}
				}
				};

				namespace ns {}
				void TestCallNonValue() {
				[[clang::musttail]] return ns; // expected-error {{unexpected namespace name 'ns': expected expression}}
				}
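
For readers who want to see the attribute outside the test harness, here is a minimal sketch of a well-formed use, assuming a Clang that includes this change. `MUSTTAIL`, `IsEven`, and `IsOdd` are names made up for the illustration. The macro expands to nothing on compilers that do not report the attribute, in which case the calls are ordinary calls with no tail-call guarantee.

```cpp
// Use the attribute only when the compiler reports support for it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL  // Fallback: plain call, no guarantee.
#endif

bool IsOdd(unsigned n);

// Caller and callee both have the signature bool(unsigned), and no local
// with a non-trivial destructor is in scope at the return statement.
bool IsEven(unsigned n) {
  if (n == 0)
    return true;
  MUSTTAIL return IsOdd(n - 1);
}

bool IsOdd(unsigned n) {
  if (n == 0)
    return false;
  MUSTTAIL return IsEven(n - 1);
}
```

The attribute is written on each `return` statement, not on the function declarations, which matches the summary of this revision. Changing the return type or parameter list of only one of the two functions should trigger the signature-mismatch diagnostics exercised in the tests above.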


Revision Contents

Diff 337581
