Tail Calls were initially disabled for PC Relative code because it was not safe to make certain assumptions about the tail calls (namely that all compiled functions no longer used the TOC pointer in R2). However, once all of the TOC pointer references have been removed it is safe to tail call everything that was tail called prior to the PC relative additions as well as a number of new cases.
For example, it is now possible to tail call indirect functions as there is no need to save and restore the TOC pointer for indirect functions if the caller is marked as may clobber R2 (st_other=1). For the same reason it is now also possible to tail call functions that are external.
Details
- Reviewers
nemanjai lei hfinkel kamaub - Group Reviewers
Restricted Project - Commits
- rG1354a03e74c2: [PowerPC][Future] Implement PC Relative Tail Calls
Diff Detail
Event Timeline
Some nit comments added
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
5509 | nit: would it be better to adjust the if check in this way? if (CFlags.IsTailCall) { if (!(CFlags.IsIndirect && Subtarget.isUsingPCRelativeCalls())) { assert(((Callee.getOpcode() == ISD::Register && cast<RegisterSDNode>(Callee)->getReg() == PPC::CTR) || Callee.getOpcode() == ISD::TargetExternalSymbol || Callee.getOpcode() == ISD::TargetGlobalAddress || isa<ConstantSDNode>(Callee)) && "Expecting a global address, external symbol, absolute value or " "register"); assert(CallOpc == PPCISD::TC_RETURN && "Unexpected call opcode for a tail call."); } DAG.getMachineFunction().getFrameInfo().setHasTailCall(); return DAG.getNode(CallOpc, dl, MVT::Other, Ops); } | |
llvm/lib/Target/PowerPC/PPCMCInstLower.cpp | ||
88–89 | nit: Can we move unsigned MIOpcode = MI->getOpcode(); down after line 97 and combined the check in line 91 with line 99? | |
96 | nit: can we move the check before line 99 to save some check for non-pcrel mode? | |
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll | ||
6 | good to add a general description for the test |
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll | ||
---|---|---|
6 | nit: checks -> check. behaviour -> behavior |
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll | ||
---|---|---|
6 | Fixed the checks but I'll leave the spelling of behaviour/behavior. :) |
None of the comments on their own really require another round of review, but there are a number of them so I'd prefer to see the updates before approving them.
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp | ||
---|---|---|
1690 | Maybe some additional info in this comment along the lines of: // The TCRETURNdi variants are direct calls. Valid targets for those are // MO_GlobalAddress operands as well as MO_ExternalSymbol with PC-Rel // since we can tail call external functions with PC-Rel (i.e. don't need to // worry about different TOC pointers). | |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
4796 | It is interesting that we have !isa<ExternalSymbolSDNode>(Callee) when the subsequent check will immediately return false if !isFunctionGlobalAddress(Callee). // All variants of 64-bit ELF ABIs without PC-Relative addressing // require that the caller and callee share the same TOC for // TCO/SCO. We cannot guarantee this for indirect // calls or calls to external functions. When PC-Relative addressing // is used, the concept of the TOC is no longer applicable so this // check is not required. if (!Subtarget.isUsingPCRelativeCalls() && !callsShareTOCBase(&Caller, Callee, getTargetMachine())) return false; | |
5511 | With non-assert builds, this will be an empty block so we should write this differently. Also, the second assert seems perfectly valid in both cases - won't the opcode still be PPCISD::TC_RETURN even for indirect tail calls? assert(((Callee.getOpcode() == ISD::Register && cast<RegisterSDNode>(Callee)->getReg() == PPC::CTR) || Callee.getOpcode() == ISD::TargetExternalSymbol || Callee.getOpcode() == ISD::TargetGlobalAddress || isa<ConstantSDNode>(Callee) || (CFlags.IsIndirect && Subtarget.isUsingPCRelativeCalls()) && "Expecting a global address, external symbol, absolute value, " "register or an indirect call (with PC-Rel addressing)"); | |
5587–5599 | I think it is time to refactor this debug dump to something more meaningful. We no longer check the linkage of the callee for deciding to tail call so it seems quite meaningless to show it in the debug dump. LLVM_DEBUG(dbgs() << "TCO caller: " << DAG.getMachineFunction().getName() << "\nTCO callee: "); LLVM_DEBUG(Callee.dump()); And then we don't need to wrap this in an NDEBUG macro. | |
llvm/lib/Target/PowerPC/PPCMCInstLower.cpp | ||
102 | Please don't create conditional block that contain only asserts. assert((Subtarget->isUsingPCRelativeCalls() || MIOpcode != PPC::BL8_NOTOC) && "BL8_NOTOC is only valid when using PC Relative Calls."); | |
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll | ||
4 | Why are we adding -enable-ppc-quad-precision? I don't see any use of float128. |
Replied to comments. I'll update the patch in a couple of min.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
4796 | Actually we do need both guards. When I first saw your comment I thought the same thing but then I realized (after ppc64-nonfunc-calls.ll failed) that isFunctionGlobalAddress is not the same as GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee); which is tested as the first thing in callsShareTOCBase. static bool isFunctionGlobalAddress(SDValue Callee) { if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) { if (Callee.getOpcode() == ISD::GlobalTLSAddress || Callee.getOpcode() == ISD::TargetGlobalTLSAddress) return false; return G->getGlobal()->getValueType()->isFunctionTy(); } return false; } Take the case where we have a GlobalSDNode where the value type is not a function type. (This would be the case for indirect calls.) The isFunctionGlobalAddress would return false but dyn_cast<GlobalAddressSDNode>(Callee); would not. | |
5587–5599 | FYI: typedef int (*ptrfunc) (); int TailCallParamFuncPtr(ptrfunc passedfunc) { return (*passedfunc) (); } Produces this debug output: TCO caller: TailCallParamFuncPtr TCO callee: t2: i64,ch = CopyFromReg t0, Register:i64 %0 Not super useful but much better than the nothing we would output before. :) | |
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll | ||
4 | Because I copy-pasted these lines form another test? Sorry, I'll fix it. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
1549–1550 | nit : asm so we have -> asm as we have | |
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp | ||
1694 | nits: (i.e don't) -> (i.e we don't) | |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
4790 | nit: It seems you need to format these comments, they seem to be wrapping around even before reaching 80 chars. |
Thanks for addressing my comments. The remaining nits do not at all seem like a reason to block the approval. LGTM.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
4796 | Oh, I see. That's great, thanks for the explanation. | |
5508–5509 | Nit: the message for the assert didn't change according to my comments. | |
5587–5599 | I think that's fairly useful. It really is a value that comes in from outside the function and we don't know anything more about it. The best we can do seems adequate to me. |
Update the comment to reflect the added condition.