This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC][Future] Enable Tail Calls for PC Relative Code
ClosedPublic

Authored by stefanp on Apr 9 2020, 3:34 AM.

Details

Summary

Tail Calls were initially disabled for PC Relative code because it was not safe to make certain assumptions about the tail calls (namely that all compiled functions no longer used the TOC pointer in R2). However, once all of the TOC pointer references have been removed it is safe to tail call everything that was tail called prior to the PC relative additions as well as a number of new cases.
For example, it is now possible to tail call indirect functions as there is no need to save and restore the TOC pointer for indirect functions if the caller is marked as may clobber R2 (st_other=1). For the same reason it is now also possible to tail call functions that are external.

Diff Detail

Event Timeline

stefanp created this revision.Apr 9 2020, 3:34 AM
kamaub added a subscriber: kamaub.Apr 9 2020, 8:10 AM
kamaub added inline comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1544–1545

Update the comment to reflect the added condition.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5615–5618

Please put condition inside an #ifndef NDEBUG block, there may be flags for an empty if condition when building without asserts otherwise.

NeHuang added a subscriber: NeHuang.Apr 9 2020, 2:59 PM

Some nit comments added

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5537

nit: would it be better to adjust the if check in this way?

if (CFlags.IsTailCall) {
  if (!(CFlags.IsIndirect && Subtarget.isUsingPCRelativeCalls())) {
    assert(((Callee.getOpcode() == ISD::Register &&
             cast<RegisterSDNode>(Callee)->getReg() == PPC::CTR) ||
            Callee.getOpcode() == ISD::TargetExternalSymbol ||
            Callee.getOpcode() == ISD::TargetGlobalAddress ||
            isa<ConstantSDNode>(Callee)) &&
           "Expecting a global address, external symbol, absolute value or "
           "register");
    assert(CallOpc == PPCISD::TC_RETURN &&
           "Unexpected call opcode for a tail call.");
  }
  DAG.getMachineFunction().getFrameInfo().setHasTailCall();
  return DAG.getNode(CallOpc, dl, MVT::Other, Ops);
}
llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
88–89

nit: Can we move unsigned MIOpcode = MI->getOpcode(); down after line 97 and combined the check in line 91 with line 99?

96

nit: can we move the check before line 99 to save some check for non-pcrel mode?

llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll
6

good to add a general description for the test

stefanp updated this revision to Diff 256963.Apr 13 2020, 5:28 AM

Addressed review comments.

stefanp marked 6 inline comments as done.Apr 13 2020, 5:29 AM
NeHuang added inline comments.Apr 14 2020, 2:33 PM
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll
7

nit: checks -> check. behaviour -> behavior

amyk added a subscriber: amyk.Apr 14 2020, 8:34 PM
amyk added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5608

s/for callee/for the callee

5611

s/ExtrernalSymbolSDNode/ExternalSymbolSDNode

stefanp updated this revision to Diff 257752.Apr 15 2020, 9:34 AM

Addressed review nits.

stefanp marked 4 inline comments as done.Apr 15 2020, 9:38 AM
stefanp added inline comments.
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll
7

Fixed the checks but I'll leave the spelling of behaviour/behavior. :)

nemanjai requested changes to this revision.Apr 16 2020, 4:55 AM

None of the comments on their own really require another round of review, but there are a number of them so I'd prefer to see the updates before approving them.

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
1677–1683

Maybe some additional info in this comment along the lines of:

// The TCRETURNdi variants are direct calls. Valid targets for those are
// MO_GlobalAddress operands as well as MO_ExternalSymbol with PC-Rel
// since we can tail call external functions with PC-Rel (i.e. don't need to
// worry about different TOC pointers).
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
4844

It is interesting that we have !isa<ExternalSymbolSDNode>(Callee) when the subsequent check will immediately return false if !isFunctionGlobalAddress(Callee).
I think we should simplify this into:

// All variants of 64-bit ELF ABIs without PC-Relative addressing
// require that the caller and callee share the same TOC for
// TCO/SCO. We cannot guarantee this for indirect
// calls or calls to external functions. When PC-Relative addressing
// is used, the concept of the TOC is no longer applicable so this
// check is not required.
if (!Subtarget.isUsingPCRelativeCalls() &&
    !callsShareTOCBase(&Caller, Callee, getTargetMachine()))
  return false;
5552

With non-assert builds, this will be an empty block so we should write this differently. Also, the second assert seems perfectly valid in both cases - won't the opcode still be PPCISD::TC_RETURN even for indirect tail calls?
Perhaps we should just add another disjunction to the assert:

assert(((Callee.getOpcode() == ISD::Register &&
         cast<RegisterSDNode>(Callee)->getReg() == PPC::CTR) ||
        Callee.getOpcode() == ISD::TargetExternalSymbol ||
        Callee.getOpcode() == ISD::TargetGlobalAddress ||
        isa<ConstantSDNode>(Callee) ||
        (CFlags.IsIndirect && Subtarget.isUsingPCRelativeCalls()) &&
       "Expecting a global address, external symbol, absolute value, "
       "register or an indirect call (with PC-Rel addressing)");
5615–5618

I think it is time to refactor this debug dump to something more meaningful. We no longer check the linkage of the callee for deciding to tail call so it seems quite meaningless to show it in the debug dump.
I think we can simplify this to something like:

LLVM_DEBUG(dbgs() << "TCO caller: " << DAG.getMachineFunction().getName()
                  << "\nTCO callee: ");
LLVM_DEBUG(Callee.dump());

And then we don't need to wrap this in an NDEBUG macro.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
102

Please don't create conditional block that contain only asserts.
This assert can be placed before this code and just be

assert((Subtarget->isUsingPCRelativeCalls() ||
        MIOpcode != PPC::BL8_NOTOC) &&
        "BL8_NOTOC is only valid when using PC Relative Calls.");
llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll
4

Why are we adding -enable-ppc-quad-precision? I don't see any use of float128.

This revision now requires changes to proceed.Apr 16 2020, 4:55 AM
stefanp marked 4 inline comments as done.Apr 17 2020, 11:38 AM

Replied to comments. I'll update the patch in a couple of min.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
4844

Actually we do need both guards. When I first saw your comment I thought the same thing but then I realized (after ppc64-nonfunc-calls.ll failed) that isFunctionGlobalAddress is not the same as GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee); which is tested as the first thing in callsShareTOCBase.
Here is that check. This function can still return false even if Callee is a GlobalAddressSDNode.

static bool isFunctionGlobalAddress(SDValue Callee) {
  if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) {
    if (Callee.getOpcode() == ISD::GlobalTLSAddress ||
        Callee.getOpcode() == ISD::TargetGlobalTLSAddress)
      return false;

    return G->getGlobal()->getValueType()->isFunctionTy();
  }

  return false;
}

Take the case where we have a GlobalSDNode where the value type is not a function type. (This would be the case for indirect calls.) The isFunctionGlobalAddress would return false but dyn_cast<GlobalAddressSDNode>(Callee); would not.
I've kept both if conditions but I've flattened them so that Subtarget.isUsingPCRelativeCalls() is in both.

5615–5618

FYI:
Yes, these dubug messages are more useful than what we had before.
However, the Callee.dump() won't always give the user useful information. For indirect calls the Callee node is sometimes a copy like this one:
C Code:

typedef int (*ptrfunc) ();
int TailCallParamFuncPtr(ptrfunc passedfunc) {
  return (*passedfunc) ();
}

Produces this debug output:

TCO caller: TailCallParamFuncPtr
TCO callee: t2: i64,ch = CopyFromReg t0, Register:i64 %0

Not super useful but much better than the nothing we would output before. :)
Also, since it's an indirect call we have no way of knowing the name of the function. This is probably the best we can do.

llvm/test/CodeGen/PowerPC/pcrel-tail-calls.ll
4

Because I copy-pasted these lines form another test? Sorry, I'll fix it.

stefanp updated this revision to Diff 258384.Apr 17 2020, 11:40 AM

Addressed Review Comments.

anil9 added a subscriber: anil9.Apr 19 2020, 6:50 PM
anil9 added inline comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1544–1545

nit : asm so we have -> asm as we have

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
1681

nits: (i.e don't) -> (i.e we don't)
Some of the external funcions -> functions.
(like memcpy for example) -> like memcpy for example,

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
4823

nit: It seems you need to format these comments, they seem to be wrapping around even before reaching 80 chars.

nemanjai accepted this revision.Apr 20 2020, 9:20 PM

Thanks for addressing my comments. The remaining nits do not at all seem like a reason to block the approval. LGTM.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
4844

Oh, I see. That's great, thanks for the explanation.

5543–5548

Nit: the message for the assert didn't change according to my comments.

5615–5618

I think that's fairly useful. It really is a value that comes in from outside the function and we don't know anything more about it. The best we can do seems adequate to me.

This revision is now accepted and ready to land.Apr 20 2020, 9:20 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2020, 11:18 AM