This is just closing the loop for
https://www.mail-archive.com/cfe-commits@lists.llvm.org/msg28837.html with a
test case and fixes PR28748 which had been introduced by r244207.
Details
- Reviewers
eli.friedman
Diff Detail
Event Timeline
The test seems a little large, the following shows that we emit a tail call with a byval argument on trunk.
struct LARGE { union { int i; }; }; struct I { virtual void m_fn1(LARGE); }; struct CBase { virtual ~CBase(); }; struct C : CBase, I { void Seek(LARGE); }; void C::Seek(LARGE) {}
Please add the options you used to compile? I can certainly shrink the test case a bit before I commit.
Nope, I don't see the tail call. Anyway, I'll simplify my test case. Don't worry about it.
clang++ -cc1 -x c++ -emit-llvm -triple i386-apple-darwin9 t.cpp
cat t.ll:
; ModuleID = 't.cpp'
source_filename = "t.cpp"
target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128"
target triple = "i386-apple-darwin9"
%struct.C = type { %struct.CBase, %struct.I }
%struct.CBase = type { i32 (...) }
%struct.I = type { i32 (...) }
%struct.LARGE = type { %union.anon }
%union.anon = type { i32 }
; Function Attrs: nounwind
define void @_ZN1C4SeekE5LARGE(%struct.C* %this, %struct.LARGE* byval align 4) #0 align 2 {
entry:
%this.addr = alloca %struct.C*, align 4 store %struct.C* %this, %struct.C** %this.addr, align 4 %this1 = load %struct.C*, %struct.C** %this.addr, align 4 ret void
}
attributes #0 = { nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-features"="+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
D'oh, that m_fn1 should have been Seek :/
$ cat t.ii
struct LARGE { union { int i; }; }; struct I { virtual void m_fn1(LARGE); }; struct CBase { virtual ~CBase(); }; struct C : CBase, I { void m_fn1(LARGE); }; void C::m_fn1(LARGE) {}
$ ~/llvm/Debug+Asserts/bin/clang -cc1 -x c++ -emit-llvm -triple i386-apple-darwin9 t.ii && grep tail t.ll
tail call void @_ZN1C5m_fn1E5LARGE(%struct.C* %3, %struct.LARGE* byval align 4 %0)
ISTM that the DWARF spec intended such thunks to be encoded as DW_AT_trampoline. That seems more appropriate than relying on codegen emitting a tailcall. This way the debugger can make the policy decision of whether or not thunks should show up in the backtrace.
In any case, correctness must always trump all else. Reverting to green should take precedence over a QoI bug like PR24235.
ISTM that the DWARF spec intended such thunks to be encoded as DW_AT_trampoline. That seems more appropriate than relying on codegen emitting a tailcall. This way the debugger can make the policy decision of whether or not thunks should show up in the backtrace.
In any case, correctness must always trump all else. Reverting to green should take precedence over a QoI bug like PR24235.
I agree to the revert, though I am not sure about the new test, it looks too complected, especially the command line.
I will let David decide on accepting that test or ask for improvement.
Regards,
Amjad
I think the correctness problem is in DSE. It should be safe for clang to put 'tail' here. The tail call does not capture the address of any local variables, but it does load from them via 'byval'.
So, if clang were to use a temporary alloca for the byval parameter, then yes, I agree marking it as a tail call would be incorrect. However, clang doesn't use an alloca, it forwards the byval pointer parameter directly to the callee:
define i32 @_ZThn4_N1C4SeekE6_LARGE(%class.C* nocapture readnone %this, %union._LARGE* byval nocapture readonly align 4 %L) unnamed_addr #0 align 2 { entry: %call = tail call i32 @_ZN1C4SeekE6_LARGE(%class.C* undef, %union._LARGE* byval nonnull align 4 %L) ret i32 %call }
Maybe the test case is over-reduced, or the problematic IR was produced by an older version of clang?
For the IR level, I think this has got to be valid:
declare void @bar(i32* byval %p) define void @foo(i32* byval %p) { tail call void @bar(i32* byval %p.tmp) ret void }
The tail is an aliasing property which indicates that the callee doesn't touch any of the alloca's in the caller, a rough proxy for "stack" since there is no other stack in LLVM IR. That doesn't mean that we're going to actually lower it to a tail call in the final assembly, but if we don't put tail on a call, we won't even consider it for becoming a tail call. I think the backend can sort out whether the byval is going to lead to a stack allocation in @foo and emit a non-tail call if so.