In D50679 the alias analysis rules were changed with regard to tail calls and byval arguments. Previously, tail calls were assumed not to alias allocas from the current frame. That assumption is no longer made for arguments with the byval attribute.
With this change, TailCallElim can be more aggressive and mark more calls as tail calls, e.g.:
  define void @test() {
    %f = alloca %struct.foo
    call void @bar(%struct.foo* byval %f)
    ret void
  }

  define void @test2(%struct.foo* byval %f) {
    call void @bar(%struct.foo* byval %f)
    ret void
  }

  define void @test3(%struct.foo* byval %f) {
    %agg.tmp = alloca %struct.foo
    %0 = bitcast %struct.foo* %agg.tmp to i8*
    %1 = bitcast %struct.foo* %f to i8*
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 40, i1 false)
    call void @bar(%struct.foo* byval %agg.tmp)
    ret void
  }
The problematic case where a byval parameter is captured by a call is still handled correctly, and will not be marked as a tail (see PR7272).
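For illustration only, here is a source-level sketch of the kind of capture meant here. It is not the PR7272 test case; the names use, later, test, run_capture_demo and g_captured are made up for this example, and it assumes the by-value struct parameter is lowered to a byval argument, as on x86-64 Linux. Once the address of the parameter has been handed to a call, a later call may read the current frame even though it takes no pointer argument, so that later call cannot safely be treated as a tail call.

```cpp
#include <cassert>

struct Foo { int i[10]; };

// Hypothetical global used to model a capture of the parameter's address.
static const Foo *g_captured = nullptr;

// Captures the address it is given.
void use(const Foo *p) { g_captured = p; }

// Takes no pointer argument, yet reads through the captured pointer.
int later() { return g_captured->i[0]; }

// `f` lives in test's frame. After use(&f), the call to later() still
// depends on that frame, so reusing the frame for it would be wrong.
int test(Foo f) {
    use(&f);
    return later();
}

// Small driver so the behaviour can be checked.
int run_capture_demo() {
    Foo f{};
    f.i[0] = 7;
    return test(f);
}
```

The point of the sketch is only that the dependency on the frame is invisible in the argument list of later().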
After this change the following code:
-------------- test.cpp --------------
struct foo { int i[10]; };
extern void bar(struct foo f);
void foo(struct foo f) { bar(f); }
--------------------------------------
Now produces on Linux x86-64:
_Z3foo3foo:                             # @_Z3foo3foo
        jmp     _Z3bar3foo              # TAILCALL
Previously it gave:
_Z3foo3foo:                             # @_Z3foo3foo
        subq    $40, %rsp
        movq    80(%rsp), %rax
        movq    %rax, 32(%rsp)
        movaps  48(%rsp), %xmm0
        movaps  64(%rsp), %xmm1
        movups  %xmm1, 16(%rsp)
        movups  %xmm0, (%rsp)
        callq   _Z3bar3foo
        addq    $40, %rsp
        retq
The new output matches gcc.
Fixes PR38862.
I think you may also want to change the logic in callUsesLocalStack; this tracks users of an alloca when the callee could leak/escape this alloca pointer value to a place where a future call could obtain it, even if it isn't passed in as an argument (say, written to a global here and read back in a future call). If the pointer received by the callee is not truly the pointer to the alloca, then this couldn't happen. This case really matters for cycles in the CFG, see the comment on DeferredTails.
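To make the "not truly the pointer to the alloca" point concrete, here is a rough C++ analogy (not LLVM code; Big, callee_sees_original and copy_demo are hypothetical names). With pass-by-value, which is what byval models, the callee works on its own copy, so any address it could leak is not the address of the caller's local.

```cpp
#include <cassert>

struct Big { int i[10]; };

// Returns true if the callee's parameter object is the very object the
// caller passed. With pass-by-value it is a distinct copy.
bool callee_sees_original(Big f, const Big *original) {
    return &f == original;
}

// Passes a local by value together with its real address for comparison.
bool copy_demo() {
    Big local{};
    return callee_sees_original(local, &local);
}
```

Since the callee never sees the caller's address, it has nothing it could stash in a global for a later call to read back, at least in this simplified picture.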