This is an archive of the discontinued LLVM Phabricator instance.

[Attr] Introduce [[clang::nonportable_musttail]] as less strict version of [[clang::musttail]]
Needs RevisionPublic

Authored by xbolva00 on Apr 6 2023, 8:16 AM.

Details

Summary

As noted in https://github.com/llvm/llvm-project/issues/54964 , musttail provide assurances that the tail call can be optimized on all targets.

Some users cares about only one specific target and if they use musttail attribute, they may see an Clang error about signatures mismatch. This mismatch is often a not a real problem in practice and LLVM can still emit a tail call.

So... Introduce [[clang::nonportable_musttail]] which is a less strict version of [[clang::musttail]] (no checking if signatures fully match).

Fixes https://github.com/llvm/llvm-project/issues/54964

Diff Detail

Event Timeline

xbolva00 created this revision.Apr 6 2023, 8:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2023, 8:16 AM
xbolva00 requested review of this revision.Apr 6 2023, 8:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2023, 8:16 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
xbolva00 edited the summary of this revision. (Show Details)Apr 6 2023, 8:17 AM
xbolva00 edited the summary of this revision. (Show Details)
xbolva00 edited the summary of this revision. (Show Details)
xbolva00 updated this revision to Diff 511427.Apr 6 2023, 8:33 AM

Mention it in attribute documentation.

Any thoughts on diagnostics here? If I'm not mistaken, with this patch, if you request an impossible tail call, you get a crash with very little useful information. (Although, see https://discourse.llvm.org/t/rfc-improving-clangs-middle-and-back-end-diagnostics/69261 )

xbolva00 added a comment.EditedApr 6 2023, 10:59 AM

Any thoughts on diagnostics here? If I'm not mistaken, with this patch, if you request an impossible tail call, you get a crash with very little useful information. (Although, see https://discourse.llvm.org/t/rfc-improving-clangs-middle-and-back-end-diagnostics/69261 )

Yes, something like:

fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail

It seems it would be possible to return also "reason" from IsEligibleForTailCallOptimization. On the other hand, IsEligibleForTailCallOptimization is implemented by many backends, so quite a lot of work, so for now maybe we could improve it atleast for X86 / ARM? I could do it for X86.

An error message pointing at the call in the source code would be good enough, I think. Adding a "reason" might be nice, but not strictly necessary.

xbolva00 updated this revision to Diff 511498.Apr 6 2023, 11:52 AM

Added location when emiting diagnostic about failed TCO in the backend.

Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2023, 11:52 AM

An error message pointing at the call in the source code would be good enough, I think. Adding a "reason" might be nice, but not strictly necessary.

Good idea. So now for:

short i(void);
int j(void) {
    [[clang::nonportable_musttail]]
    return i();
}

Clang / LLVM on X86 emits:

example.cpp:2:5: error: failed to perform tail call elimination on a call site marked musttail
int j(void) {

^

With debug info:

example.cpp:4:12: error: failed to perform tail call elimination on a call site marked musttail

return i();
       ^

Nice improvement for end users.

So if no concerns, I will do necessary modifications for other backends.

Correct me if I am wrong, but won't this produce LLVM IR that fails to validate?

The checks that exist today for [[clang::musttail]] are necessary to follow LLVM's rules about where musttail may appear: https://llvm.org/docs/LangRef.html#id328

Calls marked musttail must obey the following additional rules:

  • The call must immediately precede a ret instruction, or a pointer bitcast followed by a ret instruction.
  • The ret instruction must return the (possibly bitcasted) value produced by the call, undef, or void.
  • The calling conventions of the caller and callee must match.
  • The callee must be varargs iff the caller is varargs. Bitcasting a non-varargs function to the appropriate varargs type is legal so long as the non-varargs prefixes obey the other rules.
  • The return type must not undergo automatic conversion to an sret pointer.

In addition, if the calling convention is not swifttailcc or tailcc:

  • All ABI-impacting function attributes, such as sret, byval, inreg, returned, and inalloca, must match.
  • The caller and callee prototypes must match. Pointer types of parameters or return types may differ in pointee type, but not in address space.

It looks like this change would allow users to produce LLVM IR that violates that final condition:

The caller and callee prototypes must match. Pointer types of parameters or return types may differ in pointee type, but not in address space.

xbolva00 added a comment.EditedApr 6 2023, 2:28 PM

Possibly verifier ignores it? Because I dont see any verifier error compiling this testcase:

void NoParams();
void TestParamArityMismatchNonPortable(int x) {
  [[clang::nonportable_musttail]]
  return NoParams();
}

But yeah, this is something we need to solve somehow. Maybe just remove this one sentence or modify it...

Added location when emiting diagnostic about failed TCO in the backend.

Will the diagnostic behavior change based on optimization level? Will the diagnostic be emitted to the correct location if there's no debug info? Are there frontend tests for this diagnostic?

Higher-level question: are we sure there's enough evidence this will be used in practice to warrant adding it to Clang?

Yeah, I think the diagnostics result of this attribute are unacceptable. I understand the goal here, but I suspect the more ergonomic (albeit, perhaps surprising) is a [[should_tail]] attribute sort of thing: a tail-hint where we do 'best effort with no promises', and make no guarantees that we're going to tail it. I'm sure this would require more work in the backends + require an additional IR tag, but even that seems difficult to use properly.

Could we instead encode the platform specific tail-call rules in Sema? We definitely have attributes that differ in behavior between platforms, and this seems like a perfect candidate.

Could we instead encode the platform specific tail-call rules in Sema? We definitely have attributes that differ in behavior between platforms, and this seems like a perfect candidate.

But then you break promise of musttail that code with musttail will work on any target supported by LLVM.

Plus, while we can relax frontend checks (based on target info), it is true that current LLVM LangRef says that there must be a match between prototypes.

Higher-level question: are we sure there's enough evidence this will be used in practice to warrant adding it to Clang?

Originally mentioned as request for CTRE project, I have use cases as well, @jacobsa (Github issue) mentioned it:

Personally I would like the semantics "give me an error if this can't be a tail call on this target", not the semantics "give me an error if this can't be a tail call on all targets clang supports". I suspect most people would be looking for the former, since most people care about nearly none of the supported targets by volume.

I agree with that.

Could we instead encode the platform specific tail-call rules in Sema? We definitely have attributes that differ in behavior between platforms, and this seems like a perfect candidate.

But then you break promise of musttail that code with musttail will work on any target supported by LLVM.

I should have been clearer: I meant this with a new attribute.

Plus, while we can relax frontend checks (based on target info), it is true that current LLVM LangRef says that there must be a match between prototypes.

Right, this would require its own llvm-level concept as well.

Higher-level question: are we sure there's enough evidence this will be used in practice to warrant adding it to Clang?

Originally mentioned as request for CTRE project, I have use cases as well, @jacobsa (Github issue) mentioned it:

Personally I would like the semantics "give me an error if this can't be a tail call on this target", not the semantics "give me an error if this can't be a tail call on all targets clang supports". I suspect most people would be looking for the former, since most people care about nearly none of the supported targets by volume.

I agree with that.

I see value in that, absolutely, but this isn't the way about it. Said new attribute (whether it be [[should_tail]] for [[nonportable_musttail]]) shouldn't be diagnosing the code generator, it should be diagnosing in SEMA. This requires encoding the platform specific/LLVM-IR specific rules. If LLVM-IR docs require musttail rules (that this patch seems to be breaking!), we perhaps need an alternative at the IR level as well.

Will the diagnostic be emitted to the correct location if there's no debug info?

I did some experiments and with no debug info, I got atleast the location of the caller. If we consider than currently there is no location, anything is improvement and this is probably best we can do.

Are there frontend tests for this diagnostic?

Do you mean C ---> ASM tests? I think we dont have such test, are they even allowed in Clang testsuite?

is a [[should_tail]] attribute sort of thing: a tail-hint where we do 'best effort with no promises', and make no guarantees that we're going to tail it.

I'm not sure I see the value in that. The compiler already optimizes tail calls when it can in a best-effort manner. The purpose of [[musttail]] is to support algorithms that would blow the stack if tail calls were not optimized. It's better to get a compiler error than to get a stack overflow at runtime (especially if stack overflow only occurs with certain inputs).

is a [[should_tail]] attribute sort of thing: a tail-hint where we do 'best effort with no promises', and make no guarantees that we're going to tail it.

I'm not sure I see the value in that. The compiler already optimizes tail calls when it can in a best-effort manner. The purpose of [[musttail]] is to support algorithms that would blow the stack if tail calls were not optimized. It's better to get a compiler error than to get a stack overflow at runtime (especially if stack overflow only occurs with certain inputs).

I see, yes, that makes sense. Then I go back to my other suggestion: we this needs to encode the per-platform limitations in Sema.

I could definitely see this being an 'alias' for musttail on most platforms, and us reducing the restrictions over time on a per-platform basis though, so long as we knew codegen would be compatible with it.

So to make some conclusion - there is a need for less strict version of musttail in LLVM, right? Ideally platform specific, like x86_musttail?

So to make some conclusion - there is a need for less strict version of musttail in LLVM, right? Ideally platform specific, like x86_musttail?

I'm not sure what the rules for the LLVM definition of musttail should be, but from the CFE's perspective, something like [[nonportable_musttail]] that always enforced the rules of the CURRENT platform(note: or stricter, and relaxing would be allowed later) would be acceptable.

[[nonportable_musttail]] makes sense to me as a semantic. It indicates that the algorithm requires tail calls, but the author is willing to accept that the algorithm may be non-portable.

"Non-portable" here can mean architecture-specific, but it can also mean "sensitive to compiler flags." For example, I think there are architectures where tail calls can be optimized for statically-linked code but not across a shared library boundary.

Since the circumstances for when tail calls are supported can be complicated and subtle, architecture-specific attributes don't make as much sense to me.

nickdesaulniers requested changes to this revision.Apr 7 2023, 3:41 PM

Any thoughts on diagnostics here? If I'm not mistaken, with this patch, if you request an impossible tail call, you get a crash with very little useful information. (Although, see https://discourse.llvm.org/t/rfc-improving-clangs-middle-and-back-end-diagnostics/69261 )

Patching this in, and running through the original test cases as reported in https://github.com/llvm/llvm-project/issues/54964#issue-1207256071:

$ clang++ x.cpp
x.cpp:5:32: error: cannot perform a tail call to function 'base' because its signature is incompatible with the calling function
int f1() { [[clang::musttail]] return base(5); }
                               ^
x.cpp:1:1: note: target function has different number of parameters (expected 0 but has 1)
int base(int);
^
x.cpp:5:14: note: tail call required by 'musttail' attribute here
int f1() { [[clang::musttail]] return base(5); }
             ^
x.cpp:6:57: error: cannot perform a tail call to function 'base' because its signature is incompatible with the calling function
int f2(int x, int y /* unused */) { [[clang::musttail]] return base(5); }
                                                        ^
x.cpp:1:1: note: target function has different number of parameters (expected 2 but has 1)
int base(int);
^
x.cpp:6:39: note: tail call required by 'musttail' attribute here
int f2(int x, int y /* unused */) { [[clang::musttail]] return base(5); }
                                      ^
2 errors generated.

so that's pretty informative to the user. Changing musttail to nonportable_musttail:

$ clang++ x.cpp
cannot guarantee tail call due to mismatched parameter counts
  %call = musttail call noundef i32 @_Z4basei(i32 noundef 5)
in function _Z2f1v
fatal error: error in backend: Broken function found, compilation aborted!
clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 17.0.0 (git@github.com:llvm/llvm-project.git 36e13ec194d1601499e0069b33b9bd8f69887e1f)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /android0/llvm-project/llvm/build/bin
clang++: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang++: note: diagnostic msg: /tmp/x-299fe3.cpp
clang++: note: diagnostic msg: /tmp/x-299fe3.sh
clang++: note: diagnostic msg: 

********************

So at the very least, this should use the existing clang support for backend diagnostics rather than crashing. Please see how !srcloc works for inline asm and __attribute__((warning(""))). It should be straightforward to attach that ad-hoc metadata to call sites with that statement attribute.

Also, "woof" to the attribute name.

Taking a step back: can't we just fix the backends or modify CheckTypesMatch (or add a new method) to detect partial matches? Is there actually a safety issue on any known target? If not, creating a separate attribute feels like working around a possible bug in a particular backend when that should probably be fixed instead. Reading https://github.com/llvm/llvm-project/issues/54964, I'm curious if folks have pursued @efriedma 's suggestion #2 from https://github.com/llvm/llvm-project/issues/54964#issuecomment-1101612886?

This revision now requires changes to proceed.Apr 7 2023, 3:41 PM

I'm curious if folks have pursued @efriedma 's suggestion #2 from https://github.com/llvm/llvm-project/issues/54964#issuecomment-1101612886?

This is something we mentioned here, perform target specific checks in Sema (but still it is not possible to catch everything in Sema!)

But the problem of strict musttail call maker stays here. So either we have to relax musttail or create new one.

I'm curious if folks have pursued @efriedma 's suggestion #2 from https://github.com/llvm/llvm-project/issues/54964#issuecomment-1101612886?

This is something we mentioned here, perform target specific checks in Sema (but still it is not possible to catch everything in Sema!)

But the problem of strict musttail call maker stays here. So either we have to relax musttail or create new one.

(After reading more sources to see where musttail fails: https://github.com/llvm/llvm-project/issues/54964#issuecomment-1500708651)

I'm not against a new one, but I'll retain my "Changes Requested" on this patch until the UX is fixed. Restating:

So at the very least, this should use the existing clang support for backend diagnostics rather than crashing. Please see how !srcloc works for inline asm and attribute((warning(""))). It should be straightforward to attach that ad-hoc metadata to call sites with that statement attribute.

Do not crash the compiler like this.

yamt added a subscriber: yamt.Jun 24 2023, 12:32 PM

musttail provide assurances that the tail call can be optimized on all targets.

Such a definition doesn't make much sense because for some targets tail call optimization isn't available at all.
Eg. Wasm w/o tail call proposal, xtensa windowed abi.

So I'm not sure what's the point of the new attribute. It seems simpler to fix the existing musttail attribute.

musttail provide assurances that the tail call can be optimized on all targets.

Such a definition doesn't make much sense because for some targets tail call optimization isn't available at all.
Eg. Wasm w/o tail call proposal, xtensa windowed abi.

So I'm not sure what's the point of the new attribute. It seems simpler to fix the existing musttail attribute.

I agree, I consider the original design of this feature as a mistake, which made this attribute kinda unreliable and useless.. I dont think people are willing to just change existing impl of musttail attribute...

Such a definition doesn't make much sense because for some targets tail call optimization isn't available at all.
Eg. Wasm w/o tail call proposal, xtensa windowed abi.

That is true. But what is the correct behavior if you try to use [[clang::musttail]] on such a target? In my case, I want to get a compiler error, because if a tail call cannot be performed, the algorithm is incorrect (it will cause stack overflow at runtime).

There are three possible semantics that I can see:

  1. Perform a tail call if possible.
  2. Perform a tail call, fail to compile if this is not possible ([[clang::nonportable_musttail]])
  3. Perform a tail call, fail to compile if this is not possible or if the function signatures between caller and callee differ in ways that make tail calls difficult on some platforms ([[clang::musttail]])

Both (2) and (3) would fail to compile on Wasm without tail calls. If all you want is (1), that doesn't require any attribute (the optimizer will do it automatically).

xbolva00 added a comment.EditedJun 24 2023, 1:56 PM

Such a definition doesn't make much sense because for some targets tail call optimization isn't available at all.
Eg. Wasm w/o tail call proposal, xtensa windowed abi.

That is true. But what is the correct behavior if you try to use [[clang::musttail]] on such a target? In my case, I want to get a compiler error, because if a tail call cannot be performed, the algorithm is incorrect (it will cause stack overflow at runtime).

There are three possible semantics that I can see:

  1. Perform a tail call if possible.
  2. Perform a tail call, fail to compile if this is not possible ([[clang::nonportable_musttail]])
  3. Perform a tail call, fail to compile if this is not possible or if the function signatures between caller and callee differ in ways that make tail calls difficult on some platforms ([[clang::musttail]])

Both (2) and (3) would fail to compile on Wasm without tail calls. If all you want is (1), that doesn't require any attribute (the optimizer will do it automatically).

So clearly definition “musttail provide assurances that the tail call can be optimized on all targets.” is broken in reality.

Code with musttail will compile fine on x86, and musttail ensures that should compile to tail calls on other targets - not a case, on eg. wasm you get just an error. Or what is your definition of “all targets”? Seems fragile in practise, there are crashes even on arm..

xbolva00 added a comment.EditedJun 24 2023, 2:02 PM

If all you want is (1), that doesn't require any attribute (the optimizer will do it automatically).

Not entirely true, as compiler can do anything. In some cases you really need *musttail* (in llvm IR) to get or preserve tail call.

As current musttail is too restricted, we need nonportable_musttail (aka real musttail one would want)

I think I misunderstood @yamt's comment in https://reviews.llvm.org/D147714#4446780. I take back what I wrote above.

I agree that [[clang::nonportable_musttail]] is a nicer semantic, and the restrictions around the existing [[clang:musttail]] don't seem to buy us very much, since they are not universal enough to give a true assurance.

If someone could land a change to both LLVM and Clang to change the existing attribute, I would have no objection. But I have no idea what is involved in landing a backend (LLVM) change of that magnitude, especially since it would touch all the arch-specific backends.