This is an archive of the discontinued LLVM Phabricator instance.

Don't inline branch funnels
ClosedPublic

Authored by vitalybuka on Mar 30 2018, 4:28 PM.

Details

Summary

Inlined branch funnels fail expectation of LowerTypeTestsModule::lower()

Diff Detail

Event Timeline

vitalybuka created this revision.Mar 30 2018, 4:28 PM
pcc added a subscriber: rnk.Mar 30 2018, 4:42 PM

Probably a better fix would be to teach the inliner not to inline functions containing musttail calls if the musttail call involves an intrinsic.

In D45116#1053537, @pcc wrote:

Probably a better fix would be to teach the inliner not to inline functions containing musttail calls if the musttail call involves an intrinsic.

Why? Is inlining of function with musttail on intrinsics always unacceptable?

pcc added a comment.Mar 30 2018, 5:13 PM

Yes, AFAIK the only intrinsic that supports musttail is llvm.icall.branch.funnel, and inlining is invalid for it.

In D45116#1053547, @pcc wrote:

Yes, AFAIK the only intrinsic that supports musttail is llvm.icall.branch.funnel, and inlining is invalid for it.

Is inlining incorrect in general, for any future musttail on intrinsic? If not it sounds as incorrect overgeneralization by single instance.

pcc added a comment.Mar 30 2018, 5:26 PM

We will need to allow inlining on a case by case basis for new intrinsics, as we won't necessarily know whether the intrinsic will be able to tell which arguments are for the intrinsic and which are for the called function.

In D45116#1053552, @pcc wrote:

We will need to allow inlining on a case by case basis for new intrinsics, as we won't necessarily know whether the intrinsic will be able to tell which arguments are for the intrinsic and which are for the called function.

Done

pcc added inline comments.Mar 30 2018, 7:03 PM
llvm/lib/Analysis/InlineCost.cpp
1970 ↗(On Diff #140519)

Move this analysis to CallAnalyzer::visitCallSite.

llvm/test/Transforms/WholeProgramDevirt/branch-funnel.ll
5

I would write this as a direct test of the inliner rather than the -O3 pipeline.

vitalybuka updated this revision to Diff 140695.Apr 2 2018, 3:07 PM

inliner test

vitalybuka marked an inline comment as done.Apr 2 2018, 3:09 PM
vitalybuka added inline comments.
llvm/test/Transforms/WholeProgramDevirt/branch-funnel.ll
5

I've added inliner test, but I'd keeps this one as well to make sure that other optimizations do not affect BFs

You also need to update llvm::isInlineViable.

llvm/lib/Analysis/InlineCost.cpp
1229 ↗(On Diff #140695)

Leftover code from testing?

vitalybuka updated this revision to Diff 140697.Apr 2 2018, 3:31 PM
vitalybuka marked an inline comment as done.

remove testing change

You also need to update llvm::isInlineViable.

Done.

vitalybuka updated this revision to Diff 140704.Apr 2 2018, 4:30 PM

update isInlineViable

vitalybuka updated this revision to Diff 140706.Apr 2 2018, 4:32 PM

restore accidentally deleted code

rnk added a comment.Apr 2 2018, 5:04 PM

Why is it invalid to inline the call to the branch funnel?

In D45116#1055076, @rnk wrote:

Why is it invalid to inline the call to the branch funnel?

Branch funnels can look look like this:

define hidden void @__typeid_typeid1_0_branch_funnel(i8* nest, ...) {
  musttail call void (...) @llvm.icall.branch.funnel(i8* %0, i8* bitcast ([1 x i8*]* @vt1_1 to i8*), i32 (i8*, i32)* @vf1_1, i8* bitcast ([1 x i8*]* @vt1_2 to i8*), i32 (i8*, i32)* @vf1_2, ...)
  ret void
}

If we inline it then LowerTypeTests.cpp (near line 1806) will not be able to separate call targets from target arguments (here ...).

rnk added a comment.Apr 3 2018, 10:39 AM

Thanks, I looked, and now I understand better what's going on. These are thunks, and we do in fact want the argument forwarding behavior of a variadic musttail call.

I guess one thing that is weird about @llvm.icall.branch.funnel is that it doesn't quite obey the musttail verifier rules, which are supposed to require that the prototype of the caller matches the prototype of the callee. This ensures that we never have to adjust the location of the return address on the stack when emitting the tail call. Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

I think I actually prefer the original noinline marker patch to changing the inliner. @pcc, what do you think?

pcc added a comment.Apr 3 2018, 11:21 AM

Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.

I had to disable the verifier check for intrinsics because the check is not appropriate for all intrinsics, specifically this one.

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

Something like that might work, but even if we were to represent the target list like that, that wouldn't be enough, because the backend would also need to know how to emit the other arguments as if they were being passed as part of a regular call.

That is, not being able to inline arguments into an intrinsic is effectively a backend limitation that the inliner ought to respect. If that changes, we can change the inliner as well.

So what should I do about this patch?

rnk added a comment.Apr 3 2018, 3:17 PM
In D45116#1055897, @pcc wrote:

Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.

I had to disable the verifier check for intrinsics because the check is not appropriate for all intrinsics, specifically this one.

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

Something like that might work, but even if we were to represent the target list like that, that wouldn't be enough, because the backend would also need to know how to emit the other arguments as if they were being passed as part of a regular call.

That is, not being able to inline arguments into an intrinsic is effectively a backend limitation that the inliner ought to respect. If that changes, we can change the inliner as well.

I think I see what you're saying. Because this intrinsic doesn't go through the normal ISel call lowering code paths, it doesn't have any logic for getting the rest of the arguments materialized into registers or stack slots. The musttail marking is used primarily to make sure that IR passes don't screw things up and move the intrinsic call out of tail position.

The current lowering seems a bit dangerous, because if some pass inserted some computation (__fentry calls maybe?) to a branch funnel function, you haven't created virtual registers connecting the incoming argument registers to the outgoing argument registers used by the tail call, and they won't be spilled and preserved. I think fixing that is just a matter of doing this stuff when you make your TAILJMP64d instructions:

if (isVarArg && IsMustTail) {
  const auto &Forwards = X86Info->getForwardedMustTailRegParms();
  for (const auto &F : Forwards) {
    SDValue Val = DAG.getCopyFromReg(Chain, dl, F.VReg, F.VT);
    RegsToPass.push_back(std::make_pair(unsigned(F.PReg), Val));
  }
}

... except you'd add MI copies and operands instead of DAG nodes.

So what should I do about this patch?

I think we shouldn't overgeneralize here. In the future, we can always add new intrinsics which don't work when we inline them. See llvm.local.escape. This seems like one more thing that we just don't know how to inline.

I think I would generalize HasFrameEscape to HasUninlineableIntrinsic and set that if you see a musttail call of this intrinsic. For example, I can imagine someone wanting to do guaranteed tail calls with some kind of statepoint intrinsic. We shouldn't block inlining that.

pcc added a comment.Apr 3 2018, 3:25 PM
In D45116#1056213, @rnk wrote:
In D45116#1055897, @pcc wrote:

Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.

I had to disable the verifier check for intrinsics because the check is not appropriate for all intrinsics, specifically this one.

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

Something like that might work, but even if we were to represent the target list like that, that wouldn't be enough, because the backend would also need to know how to emit the other arguments as if they were being passed as part of a regular call.

That is, not being able to inline arguments into an intrinsic is effectively a backend limitation that the inliner ought to respect. If that changes, we can change the inliner as well.

I think I see what you're saying. Because this intrinsic doesn't go through the normal ISel call lowering code paths, it doesn't have any logic for getting the rest of the arguments materialized into registers or stack slots. The musttail marking is used primarily to make sure that IR passes don't screw things up and move the intrinsic call out of tail position.

The current lowering seems a bit dangerous, because if some pass inserted some computation (__fentry calls maybe?) to a branch funnel function, you haven't created virtual registers connecting the incoming argument registers to the outgoing argument registers used by the tail call, and they won't be spilled and preserved. I think fixing that is just a matter of doing this stuff when you make your TAILJMP64d instructions:

if (isVarArg && IsMustTail) {
  const auto &Forwards = X86Info->getForwardedMustTailRegParms();
  for (const auto &F : Forwards) {
    SDValue Val = DAG.getCopyFromReg(Chain, dl, F.VReg, F.VT);
    RegsToPass.push_back(std::make_pair(unsigned(F.PReg), Val));
  }
}

... except you'd add MI copies and operands instead of DAG nodes.

Are you sure that would work? The pass runs after RA.

So what should I do about this patch?

I think we shouldn't overgeneralize here. In the future, we can always add new intrinsics which don't work when we inline them. See llvm.local.escape. This seems like one more thing that we just don't know how to inline.

I think I would generalize HasFrameEscape to HasUninlineableIntrinsic and set that if you see a musttail call of this intrinsic. For example, I can imagine someone wanting to do guaranteed tail calls with some kind of statepoint intrinsic. We shouldn't block inlining that.

That sounds fine to me.

rnk added a comment.Apr 3 2018, 3:28 PM
In D45116#1056216, @pcc wrote:

Are you sure that would work? The pass runs after RA.

Good point. You might want X86ISelLowering to copy these registers into and out of the ICALL_BRANCH_FUNNEL node, though.

Don't inline @llvm.icall.branch.funnel

rnk accepted this revision.Apr 4 2018, 11:59 AM

Looks good to me, what do you think, Peter?

This revision is now accepted and ready to land.Apr 4 2018, 11:59 AM
pcc accepted this revision.Apr 4 2018, 12:05 PM

LGTM

This revision was automatically updated to reflect the committed changes.