This is an archive of the discontinued LLVM Phabricator instance.

In D45116#1053537, @pcc wrote:

Probably a better fix would be to teach the inliner not to inline functions containing musttail calls if the musttail call involves an intrinsic.

Why? Is inlining a function that contains a musttail call to an intrinsic always unacceptable?

Yes, AFAIK the only intrinsic that supports musttail is llvm.icall.branch.funnel, and inlining is invalid for it.

In D45116#1053547, @pcc wrote:

Yes, AFAIK the only intrinsic that supports musttail is llvm.icall.branch.funnel, and inlining is invalid for it.

Is inlining incorrect in general, for any future musttail call to an intrinsic? If not, this sounds like an incorrect overgeneralization from a single instance.

We will need to allow inlining on a case by case basis for new intrinsics, as we won't necessarily know whether the intrinsic will be able to tell which arguments are for the intrinsic and which are for the called function.

inliner

Herald added subscribers: haicheng, eraman. Mar 30 2018, 6:44 PM

In D45116#1053552, @pcc wrote:

We will need to allow inlining on a case by case basis for new intrinsics, as we won't necessarily know whether the intrinsic will be able to tell which arguments are for the intrinsic and which are for the called function.

Done

Harbormaster completed remote builds in B16610: Diff 140519. Mar 30 2018, 6:45 PM

pcc added inline comments. Mar 30 2018, 7:03 PM

llvm/lib/Analysis/InlineCost.cpp
Line 1970 (on Diff #140519): Move this analysis to `CallAnalyzer::visitCallSite`.
llvm/test/Transforms/WholeProgramDevirt/branch-funnel.ll
Line 6 (on Diff #140519): I would write this as a direct test of the inliner rather than the `-O3` pipeline.

inliner test

Harbormaster completed remote builds in B16644: Diff 140695. Apr 2 2018, 3:09 PM

vitalybuka marked an inline comment as done. Apr 2 2018, 3:09 PM

vitalybuka added inline comments.

llvm/test/Transforms/WholeProgramDevirt/branch-funnel.ll
Line 6 (on Diff #140519): I've added an inliner test, but I'd keep this one as well to make sure that other optimizations do not affect branch funnels.

You also need to update llvm::isInlineViable.

llvm/lib/Analysis/InlineCost.cpp
Line 1229 (on Diff #140695): Leftover code from testing?

remove testing change

Harbormaster completed remote builds in B16645: Diff 140697. Apr 2 2018, 3:34 PM

In D45116#1054958, @efriedma wrote:

You also need to update llvm::isInlineViable.

Done.

update isInlineViable

restore accidentally deleted code

Why is it invalid to inline the call to the branch funnel?

In D45116#1055076, @rnk wrote:

Why is it invalid to inline the call to the branch funnel?

Branch funnels can look like this:

define hidden void @__typeid_typeid1_0_branch_funnel(i8* nest, ...) {
  musttail call void (...) @llvm.icall.branch.funnel(i8* %0, i8* bitcast ([1 x i8*]* @vt1_1 to i8*), i32 (i8*, i32)* @vf1_1, i8* bitcast ([1 x i8*]* @vt1_2 to i8*), i32 (i8*, i32)* @vf1_2, ...)
  ret void
}

If we inline it, then LowerTypeTests.cpp (near line 1806) will not be able to separate the call targets from the arguments forwarded to the target (here, the trailing ...).
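To make the ambiguity concrete, here is a minimal, self-contained sketch. It does not use any LLVM API: `Operands`, `Target`, and `splitFunnelOperands` are hypothetical names, and the rule that every operand after the nest pointer belongs to a (vtable, function) pair is an assumption made for illustration, not a copy of the code in LowerTypeTests.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the operand list of a call to
// @llvm.icall.branch.funnel. Operand 0 is the nest pointer; the rest are
// assumed to be (vtable, function) pairs.
using Operands = std::vector<std::string>;
using Target = std::pair<std::string, std::string>;

// Assumed lowering rule for this sketch: pair up every operand after the
// first. Returns false when a dangling operand makes pairing impossible.
bool splitFunnelOperands(const Operands &Ops, std::vector<Target> &Targets) {
  assert(!Ops.empty() && "operand 0 (the nest pointer) is required");
  Targets.clear();
  if ((Ops.size() - 1) % 2 != 0)
    return false; // A dangling operand cannot belong to any pair.
  for (std::size_t I = 1; I + 1 < Ops.size(); I += 2)
    Targets.emplace_back(Ops[I], Ops[I + 1]);
  return true;
}
```

In the thunk, the forwarded arguments stay behind the opaque `...`, so only the real targets are paired. If inlining appended concrete arguments such as `%obj` and `1` to the operand list, this model would either report them as one more target or fail to pair them at all.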

Thanks, I looked, and now I understand better what's going on. These are thunks, and we do in fact want the argument forwarding behavior of a variadic musttail call.

I guess one thing that is weird about @llvm.icall.branch.funnel is that it doesn't quite obey the musttail verifier rules, which are supposed to require that the prototype of the caller matches the prototype of the callee. This ensures that we never have to adjust the location of the return address on the stack when emitting the tail call. Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.
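As a rough illustration of the prototype rule mentioned here, the following sketch compares two simplified prototypes. `ToyPrototype` and `prototypesMatch` are hypothetical and are not LLVM's verifier; the rule is reduced to "return type, fixed parameters, and variadic-ness must agree", which is an assumption that leaves out details such as calling conventions and attributes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified function prototype; not LLVM's FunctionType.
struct ToyPrototype {
  std::string ReturnType;
  std::vector<std::string> FixedParams;
  bool IsVarArg;
};

// Simplified form of the rule: return type, fixed parameters, and
// variadic-ness must all agree between caller and callee.
bool prototypesMatch(const ToyPrototype &Caller, const ToyPrototype &Callee) {
  assert(!Caller.ReturnType.empty() && !Callee.ReturnType.empty() &&
         "every prototype has a return type");
  return Caller.ReturnType == Callee.ReturnType &&
         Caller.FixedParams == Callee.FixedParams &&
         Caller.IsVarArg == Callee.IsVarArg;
}
```

Under this model, the branch funnel thunk `void (i8* nest, ...)` and the intrinsic `void (...)` do not match, because the thunk has one fixed parameter and the intrinsic has none.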

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

I think I actually prefer the original noinline marker patch to changing the inliner. @pcc, what do you think?

Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.

I had to disable the verifier check for intrinsics because the check is not appropriate for all intrinsics, specifically this one.

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

Something like that might work, but even if we were to represent the target list like that, that wouldn't be enough, because the backend would also need to know how to emit the other arguments as if they were being passed as part of a regular call.

That is, not being able to inline arguments into an intrinsic is effectively a backend limitation that the inliner ought to respect. If that changes, we can change the inliner as well.

So what should I do about this patch?

In D45116#1055897, @pcc wrote:

Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.

I had to disable the verifier check for intrinsics because the check is not appropriate for all intrinsics, specifically this one.

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

Something like that might work, but even if we were to represent the target list like that, that wouldn't be enough, because the backend would also need to know how to emit the other arguments as if they were being passed as part of a regular call.

That is, not being able to inline arguments into an intrinsic is effectively a backend limitation that the inliner ought to respect. If that changes, we can change the inliner as well.

I think I see what you're saying. Because this intrinsic doesn't go through the normal ISel call lowering code paths, it doesn't have any logic for getting the rest of the arguments materialized into registers or stack slots. The musttail marking is used primarily to make sure that IR passes don't screw things up and move the intrinsic call out of tail position.

The current lowering seems a bit dangerous, because if some pass inserted some computation (__fentry calls maybe?) to a branch funnel function, you haven't created virtual registers connecting the incoming argument registers to the outgoing argument registers used by the tail call, and they won't be spilled and preserved. I think fixing that is just a matter of doing this stuff when you make your TAILJMP64d instructions:

if (isVarArg && IsMustTail) {
  const auto &Forwards = X86Info->getForwardedMustTailRegParms();
  for (const auto &F : Forwards) {
    SDValue Val = DAG.getCopyFromReg(Chain, dl, F.VReg, F.VT);
    RegsToPass.push_back(std::make_pair(unsigned(F.PReg), Val));
  }
}

... except you'd add MI copies and operands instead of DAG nodes.

In D45116#1056149, @vitalybuka wrote:

So what should I do about this patch?

I think we shouldn't overgeneralize here. In the future, we can always add new intrinsics which don't work when we inline them. See llvm.local.escape. This seems like one more thing that we just don't know how to inline.

I think I would generalize HasFrameEscape to HasUninlineableIntrinsic and set that if you see a musttail call of this intrinsic. For example, I can imagine someone wanting to do guaranteed tail calls with some kind of statepoint intrinsic. We shouldn't block inlining that.
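A minimal sketch of that suggestion, assuming a toy analyzer rather than the real `CallAnalyzer`: `IntrinsicKind` and `ToyCallAnalyzer` are hypothetical, and only the flag name `HasUninlineableIntrinsic` comes from the discussion. One flag covers every intrinsic the inliner cannot handle, and intrinsics are listed case by case, so a hypothetical statepoint-like intrinsic is not blocked.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for intrinsic IDs; not LLVM's Intrinsic::ID enum.
enum class IntrinsicKind {
  None,
  Memcpy,
  LocalEscape,
  IcallBranchFunnel,
  VAStart,
  Statepoint
};

// Toy analyzer: a single flag records any intrinsic that blocks inlining.
struct ToyCallAnalyzer {
  bool HasUninlineableIntrinsic = false;
  bool UsesVarArgs = false;

  void visitCall(IntrinsicKind K) {
    switch (K) {
    case IntrinsicKind::IcallBranchFunnel:
    case IntrinsicKind::LocalEscape:
      HasUninlineableIntrinsic = true;
      break;
    case IntrinsicKind::VAStart:
      UsesVarArgs = true;
      break;
    case IntrinsicKind::None:
    case IntrinsicKind::Memcpy:
    case IntrinsicKind::Statepoint:
      break; // Not listed as uninlineable, so inlining stays possible.
    }
  }

  // Returns true when no visited call blocks inlining.
  bool analyze(const std::vector<IntrinsicKind> &Calls) {
    assert(!HasUninlineableIntrinsic && !UsesVarArgs &&
           "analyze expects a fresh analyzer");
    for (IntrinsicKind K : Calls) {
      visitCall(K);
      if (HasUninlineableIntrinsic || UsesVarArgs)
        return false;
    }
    return true;
  }
};
```

Adding a new uninlineable intrinsic in this model means adding one `case` label, which is the per-intrinsic opt-out the discussion converges on.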

In D45116#1056213, @rnk wrote:
In D45116#1055897, @pcc wrote:

Maybe I forgot to check intrinsics, or the arguments that are part of the variadic pack in the verifier.

I had to disable the verifier check for intrinsics because the check is not appropriate for all intrinsics, specifically this one.

Have you considered passing these extra parameters as an operand bundle, or do those not support variable numbers of operands?

Something like that might work, but even if we were to represent the target list like that, that wouldn't be enough, because the backend would also need to know how to emit the other arguments as if they were being passed as part of a regular call.

That is, not being able to inline arguments into an intrinsic is effectively a backend limitation that the inliner ought to respect. If that changes, we can change the inliner as well.

I think I see what you're saying. Because this intrinsic doesn't go through the normal ISel call lowering code paths, it doesn't have any logic for getting the rest of the arguments materialized into registers or stack slots. The musttail marking is used primarily to make sure that IR passes don't screw things up and move the intrinsic call out of tail position.

The current lowering seems a bit dangerous, because if some pass inserted some computation (__fentry calls maybe?) to a branch funnel function, you haven't created virtual registers connecting the incoming argument registers to the outgoing argument registers used by the tail call, and they won't be spilled and preserved. I think fixing that is just a matter of doing this stuff when you make your TAILJMP64d instructions:

if (isVarArg && IsMustTail) {
  const auto &Forwards = X86Info->getForwardedMustTailRegParms();
  for (const auto &F : Forwards) {
    SDValue Val = DAG.getCopyFromReg(Chain, dl, F.VReg, F.VT);
    RegsToPass.push_back(std::make_pair(unsigned(F.PReg), Val));
  }
}

... except you'd add MI copies and operands instead of DAG nodes.

Are you sure that would work? The pass runs after RA.

In D45116#1056149, @vitalybuka wrote:

So what should I do about this patch?

I think we shouldn't overgeneralize here. In the future, we can always add new intrinsics which don't work when we inline them. See llvm.local.escape. This seems like one more thing that we just don't know how to inline.

I think I would generalize HasFrameEscape to HasUninlineableIntrinsic and set that if you see a musttail call of this intrinsic. For example, I can imagine someone wanting to do guaranteed tail calls with some kind of statepoint intrinsic. We shouldn't block inlining that.

That sounds fine to me.

In D45116#1056216, @pcc wrote:

Are you sure that would work? The pass runs after RA.

Good point. You might want X86ISelLowering to copy these registers into and out of the ICALL_BRANCH_FUNNEL node, though.

Don't inline @llvm.icall.branch.funnel

Harbormaster completed remote builds in B16720: Diff 141011. Apr 4 2018, 11:39 AM

Looks good to me, what do you think, Peter?

This revision is now accepted and ready to land. Apr 4 2018, 11:59 AM

LGTM

Closed by commit rL329235: Don't inline @llvm.icall.branch.funnel (authored by vitalybuka). Apr 4 2018, 2:50 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                            Size
llvm/trunk/lib/Analysis/InlineCost.cpp                          23 lines
llvm/trunk/test/Transforms/Inline/inline-brunch-funnel.ll       35 lines
llvm/trunk/test/Transforms/WholeProgramDevirt/branch-funnel.ll  20 lines

Diff 141066

llvm/trunk/lib/Analysis/InlineCost.cpp

@@ (129 lines not shown) @@ class CallAnalyzer : public InstVisitor<CallAnalyzer, bool> {
   bool IsCallerRecursive;
   bool IsRecursiveCall;
   bool ExposesReturnsTwice;
   bool HasDynamicAlloca;
   bool ContainsNoDuplicateCall;
   bool HasReturn;
   bool HasIndirectBr;
-  bool HasFrameEscape;
+  bool HasUninlineableIntrinsic;
   bool UsesVarArgs;

   /// Number of bytes allocated statically by the callee.
   uint64_t AllocatedSize;
   unsigned NumInstructions, NumVectorInstructions;
   int VectorBonus, TenPercentVectorBonus;
   // Bonus to be applied when the callee has only one reachable basic block.
   int SingleBBBonus;
@@ (129 lines not shown) @@ CallAnalyzer(const TargetTransformInfo &TTI,
       : TTI(TTI), GetAssumptionCache(GetAssumptionCache), GetBFI(GetBFI),
         PSI(PSI), F(Callee), DL(F.getParent()->getDataLayout()), ORE(ORE),
         CandidateCS(CSArg), Params(Params), Threshold(Params.DefaultThreshold),
         Cost(0), ComputeFullInlineCost(OptComputeFullInlineCost ||
                                        Params.ComputeFullInlineCost || ORE),
         IsCallerRecursive(false), IsRecursiveCall(false),
         ExposesReturnsTwice(false), HasDynamicAlloca(false),
         ContainsNoDuplicateCall(false), HasReturn(false), HasIndirectBr(false),
-        HasFrameEscape(false), UsesVarArgs(false), AllocatedSize(0), NumInstructions(0),
-        NumVectorInstructions(0), VectorBonus(0), SingleBBBonus(0),
-        EnableLoadElimination(true), LoadEliminationCost(0), NumConstantArgs(0),
-        NumConstantOffsetPtrArgs(0), NumAllocaArgs(0), NumConstantPtrCmps(0),
-        NumConstantPtrDiffs(0), NumInstructionsSimplified(0),
-        SROACostSavings(0), SROACostSavingsLost(0) {}
+        HasUninlineableIntrinsic(false), UsesVarArgs(false), AllocatedSize(0),
+        NumInstructions(0), NumVectorInstructions(0), VectorBonus(0),
+        SingleBBBonus(0), EnableLoadElimination(true), LoadEliminationCost(0),
+        NumConstantArgs(0), NumConstantOffsetPtrArgs(0), NumAllocaArgs(0),
+        NumConstantPtrCmps(0), NumConstantPtrDiffs(0),
+        NumInstructionsSimplified(0), SROACostSavings(0),
+        SROACostSavingsLost(0) {}

   bool analyzeCall(CallSite CS);

   int getThreshold() { return Threshold; }
   int getCost() { return Cost; }

   // Keep a bunch of stats about the cost savings found so we can print them
   // out when debugging.
@@ (928 lines not shown) @@ if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(CS.getInstruction())) {
       return false;

     case Intrinsic::memset:
     case Intrinsic::memcpy:
     case Intrinsic::memmove:
       disableLoadElimination();
       // SROA can usually chew through these intrinsics, but they aren't free.
       return false;
+    case Intrinsic::icall_branch_funnel:
     case Intrinsic::localescape:
-      HasFrameEscape = true;
+      HasUninlineableIntrinsic = true;
       return false;
     case Intrinsic::vastart:
     case Intrinsic::vaend:
       UsesVarArgs = true;
       return false;
     }
   }

@@ (323 lines not shown) @@ for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) {
       if (Base::visit(&*I))
         ++NumInstructionsSimplified;
       else
         Cost += InlineConstants::InstrCost;

       using namespace ore;
       // If the visit this instruction detected an uninlinable pattern, abort.
       if (IsRecursiveCall || ExposesReturnsTwice || HasDynamicAlloca ||
-          HasIndirectBr || HasFrameEscape || UsesVarArgs) {
+          HasIndirectBr || HasUninlineableIntrinsic || UsesVarArgs) {
         if (ORE)
           ORE->emit([&]() {
             return OptimizationRemarkMissed(DEBUG_TYPE, "NeverInline",
                                             CandidateCS.getInstruction())
                    << NV("Callee", &F)
                    << " has uninlinable pattern and cost is not fully computed";
           });
         return false;
@@ (455 lines not shown) @@ for (auto &II : *BI) {
       if (!ReturnsTwice && CS.isCall() &&
           cast<CallInst>(CS.getInstruction())->canReturnTwice())
         return false;

       if (CS.getCalledFunction())
         switch (CS.getCalledFunction()->getIntrinsicID()) {
         default:
           break;
+        // Disallow inlining of @llvm.icall.branch.funnel because current
+        // backend can't separate call targets from call arguments.
+        case llvm::Intrinsic::icall_branch_funnel:
         // Disallow inlining functions that call @llvm.localescape. Doing this
         // correctly would require major changes to the inliner.
         case llvm::Intrinsic::localescape:
         // Disallow inlining of functions that access VarArgs.
         case llvm::Intrinsic::vastart:
         case llvm::Intrinsic::vaend:
           return false;
         }
@@ (88 lines not shown) @@

llvm/trunk/test/Transforms/Inline/inline-brunch-funnel.ll

+; Test that inliner skips @llvm.icall.branch.funnel
+; RUN: opt < %s -inline -S | FileCheck %s
+
+target datalayout = "e-p:64:64"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare void @llvm.icall.branch.funnel(...)
+
+; CHECK-LABEL: define void @fn_musttail(
+define void @fn_musttail() {
+  call void (...) @bf_musttail()
+  ; CHECK: call void (...) @bf_musttail(
+  ret void
+}
+
+; CHECK-LABEL: define internal void @bf_musttail(
+define internal void @bf_musttail(...) {
+  musttail call void (...) @llvm.icall.branch.funnel(...)
+  ; CHECK: musttail call void (...) @llvm.icall.branch.funnel(
+  ret void
+}
+
+; CHECK-LABEL: define void @fn_musttail_always(
+define void @fn_musttail_always() {
+  call void (...) @bf_musttail_always()
+  ; CHECK: call void (...) @bf_musttail_always(
+  ret void
+}
+
+; CHECK-LABEL: define internal void @bf_musttail_always(
+define internal void @bf_musttail_always(...) alwaysinline {
+  musttail call void (...) @llvm.icall.branch.funnel(...)
+  ; CHECK: musttail call void (...) @llvm.icall.branch.funnel(
+  ret void
+}
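The second pair of functions in the test above checks that `alwaysinline` does not force a branch funnel to be inlined. A rough model of that behavior, assuming the viability check is consulted before the attribute is honored: `ToyCallee`, `Decision`, `isViable`, and `decide` are hypothetical names for this sketch, not LLVM's API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified view of a callee; not LLVM's Function class.
struct ToyCallee {
  bool AlwaysInline;
  std::vector<std::string> CalledIntrinsics;
};

enum class Decision { Always, Never, CostBased };

// Rejects the same intrinsics as the isInlineViable hunk in the diff above.
bool isViable(const ToyCallee &F) {
  for (const std::string &Name : F.CalledIntrinsics) {
    assert(!Name.empty() && "intrinsic names must be non-empty");
    if (Name == "llvm.icall.branch.funnel" || Name == "llvm.localescape" ||
        Name == "llvm.va_start" || Name == "llvm.va_end")
      return false;
  }
  return true;
}

// Assumed ordering for this sketch: viability first, attribute second.
Decision decide(const ToyCallee &F) {
  if (!isViable(F))
    return Decision::Never;
  return F.AlwaysInline ? Decision::Always : Decision::CostBased;
}
```

With this ordering, a callee like `@bf_musttail_always` is rejected even though it carries `alwaysinline`, which matches the CHECK lines that expect the call to remain.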

llvm/trunk/test/Transforms/WholeProgramDevirt/branch-funnel.ll

 ; RUN: opt -S -wholeprogramdevirt %s | FileCheck --check-prefixes=CHECK,RETP %s
 ; RUN: sed -e 's,+retpoline,-retpoline,g' %s | opt -S -wholeprogramdevirt | FileCheck --check-prefixes=CHECK,NORETP %s

 ; RUN: opt -wholeprogramdevirt -wholeprogramdevirt-summary-action=export -wholeprogramdevirt-read-summary=%S/Inputs/export.yaml -wholeprogramdevirt-write-summary=%t -S -o - %s | FileCheck --check-prefixes=CHECK,RETP %s

+; RUN: opt -wholeprogramdevirt -wholeprogramdevirt-summary-action=export -wholeprogramdevirt-read-summary=%S/Inputs/export.yaml -wholeprogramdevirt-write-summary=%t -O3 -S -o - %s | FileCheck --check-prefixes=CHECK %s
+
 ; RUN: FileCheck --check-prefix=SUMMARY %s < %t

 ; SUMMARY: TypeIdMap:
 ; SUMMARY-NEXT: typeid1:
 ; SUMMARY-NEXT: TTRes:
 ; SUMMARY-NEXT: Kind: Unsat
 ; SUMMARY-NEXT: SizeM1BitWidth: 0
 ; SUMMARY-NEXT: AlignLog2: 0
@@ (72 lines not shown) @@
 declare i32 @vf3_2(i8* %this, i32 %arg)

 @vt4_1 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf4_1 to i8*)], !type !3
 @vt4_2 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf4_2 to i8*)], !type !3

 declare i32 @vf4_1(i8* %this, i32 %arg)
 declare i32 @vf4_2(i8* %this, i32 %arg)

-; CHECK: define i32 @fn1
+; CHECK-LABEL: define i32 @fn1
+; CHECK-NOT: call void (...) @llvm.icall.branch.funnel
 define i32 @fn1(i8* %obj) #0 {
   %vtableptr = bitcast i8* %obj to [1 x i8*]**
   %vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
   %vtablei8 = bitcast [1 x i8*]* %vtable to i8*
   %p = call i1 @llvm.type.test(i8* %vtablei8, metadata !"typeid1")
   call void @llvm.assume(i1 %p)
   %fptrptr = getelementptr [1 x i8*], [1 x i8*]* %vtable, i32 0, i32 0
   %fptr = load i8*, i8** %fptrptr
   %fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*
   ; RETP: {{.*}} = bitcast {{.*}} to i8*
   ; RETP: [[VT1:%.*]] = bitcast {{.*}} to i8*
   ; RETP: call i32 bitcast (void (i8*, ...)* @__typeid_typeid1_0_branch_funnel to i32 (i8*, i8*, i32)*)(i8* nest [[VT1]], i8* %obj, i32 1)
   %result = call i32 %fptr_casted(i8* %obj, i32 1)
   ; NORETP: call i32 %
   ret i32 %result
 }

-; CHECK: define i32 @fn2
+; CHECK-LABEL: define i32 @fn2
+; CHECK-NOT: call void (...) @llvm.icall.branch.funnel
 define i32 @fn2(i8* %obj) #0 {
   %vtableptr = bitcast i8* %obj to [1 x i8*]**
   %vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
   %vtablei8 = bitcast [1 x i8*]* %vtable to i8*
   %p = call i1 @llvm.type.test(i8* %vtablei8, metadata !"typeid2")
   call void @llvm.assume(i1 %p)
   %fptrptr = getelementptr [1 x i8*], [1 x i8*]* %vtable, i32 0, i32 0
   %fptr = load i8*, i8** %fptrptr
   %fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*
   ; CHECK: call i32 %
   %result = call i32 %fptr_casted(i8* %obj, i32 1)
   ret i32 %result
 }

-; CHECK: define i32 @fn3
+; CHECK-LABEL: define i32 @fn3
+; CHECK-NOT: call void (...) @llvm.icall.branch.funnel
 define i32 @fn3(i8* %obj) #0 {
   %vtableptr = bitcast i8* %obj to [1 x i8*]**
   %vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
   %vtablei8 = bitcast [1 x i8*]* %vtable to i8*
   %p = call i1 @llvm.type.test(i8* %vtablei8, metadata !4)
   call void @llvm.assume(i1 %p)
   %fptrptr = getelementptr [1 x i8*], [1 x i8*]* %vtable, i32 0, i32 0
   %fptr = load i8*, i8** %fptrptr
   %fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*
   ; RETP: call i32 bitcast (void (i8*, ...)* @branch_funnel to
   ; NORETP: call i32 %
   %result = call i32 %fptr_casted(i8* %obj, i32 1)
   ret i32 %result
 }

-; CHECK: define internal void @branch_funnel(i8* nest, ...)
+; CHECK-LABEL: define internal void @branch_funnel(i8*

 ; CHECK: define hidden void @__typeid_typeid1_0_branch_funnel(i8* nest, ...)
-; CHECK-NEXT: call void (...) @llvm.icall.branch.funnel(i8* %0, i8* bitcast ([1 x i8*]* @vt1_1 to i8*), i32 (i8*, i32)* @vf1_1, i8* bitcast ([1 x i8*]* @vt1_2 to i8*), i32 (i8*, i32)* @vf1_2, ...)
+; CHECK-NEXT: musttail call void (...) @llvm.icall.branch.funnel(i8* %0, i8* bitcast ([1 x i8*]* {{(nonnull )?}}@vt1_1 to i8*), i32 (i8*, i32)* {{(nonnull )?}}@vf1_1, i8* bitcast ([1 x i8*]* {{(nonnull )?}}@vt1_2 to i8*), i32 (i8*, i32)* {{(nonnull )?}}@vf1_2, ...)

 declare i1 @llvm.type.test(i8*, metadata)
 declare void @llvm.assume(i1)

 !0 = !{i32 0, !"typeid1"}
 !1 = !{i32 0, !"typeid2"}
 !2 = !{i32 0, !"typeid3"}
 !3 = !{i32 0, !4}
 !4 = distinct !{}

 attributes #0 = { "target-features"="+retpoline" }

Don't inline branch funnels (Closed, Public)
