We used to do that, and it makes no sense. I haven't looked at the remaining targets, but ARM and X86 already disable tail calls whenever there's an sret (caller or callee, explicit or implicit), whereas AArch64 bravely tries to continue. In some cases, it generated broken code such as:
    _test_tailcall_sret:
      sub sp, sp, #128
      mov x8, sp
      add sp, sp, #128
      b _test_sret
for:
    declare i1024 @test_sret() #0

    define i1024 @test_tailcall_sret() #0 {
      %a = tail call i1024 @test_sret()
      ret i1024 %a
    }
There are two parts to this (I'll commit separately, but it makes sense to review together):
- implicit sret: the return slot will be part of the caller's stack frame, which is popped before the branch (see the x8 setup above), so there's no way we can tail-call
- explicit sret: a good enough approximation is: if the sret pointer is an Instruction, it might be function-local (alloca, usually), so we don't tail-call.
In practice, neither of these happens with well-behaved frontends such as clang, which will have an explicit sret and forward it across tail calls.
Also, I say approximation because there's one case we pessimize (the GEP testcase), but that's a really weird situation. Wrapping the sret pointer origin check in a GetUnderlyingObject does the trick though, so I can add it if desired.
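To make the two rules and the possible refinement concrete, here is a rough standalone model. It is a sketch only, not the patch itself: Value, ValueKind, CallInfo, mayTailCall and mayTailCallRefined are hypothetical stand-ins invented for illustration, not LLVM classes, and underlyingObject only mimics the idea behind GetUnderlyingObject by stripping GEPs.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for IR values; NOT the real LLVM API.
enum class ValueKind { Argument, Global, Alloca, GEP };

struct Value {
    ValueKind kind;
    const Value *base; // for a GEP: its pointer operand; otherwise nullptr
};

// In this model, allocas and GEPs are the only "Instruction" values.
bool isInstruction(const Value &v) {
    return v.kind == ValueKind::Alloca || v.kind == ValueKind::GEP;
}

struct CallInfo {
    bool implicitSRet;            // return value demoted to a stack slot
    const Value *explicitSRetPtr; // explicit sret argument, or nullptr
};

// Conservative rule: reject implicit sret, and reject an explicit sret
// whose pointer is any Instruction (it might be function-local).
bool mayTailCall(const CallInfo &ci) {
    if (ci.implicitSRet)
        return false;
    if (ci.explicitSRetPtr && isInstruction(*ci.explicitSRetPtr))
        return false;
    return true;
}

// Assumed analogue of GetUnderlyingObject: walk through GEPs to the base.
const Value *underlyingObject(const Value *v) {
    while (v->kind == ValueKind::GEP && v->base)
        v = v->base;
    return v;
}

// Refined rule: look at the underlying object, so a GEP off an incoming
// argument is no longer pessimized, while a GEP off an alloca still is.
bool mayTailCallRefined(const CallInfo &ci) {
    if (ci.implicitSRet)
        return false;
    if (ci.explicitSRetPtr &&
        isInstruction(*underlyingObject(ci.explicitSRetPtr)))
        return false;
    return true;
}
```

In this model, a forwarded sret argument passes both checks, an alloca fails both, and a GEP based on an argument is rejected by mayTailCall but accepted by mayTailCallRefined, which is the one pessimized case mentioned above.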
-Ahmed