This is an archive of the discontinued LLVM Phabricator instance.

Differential D17732

Introduce @llvm.experimental.deoptimize
ClosedPublic

Authored by sanjoy on Feb 29 2016, 1:43 PM.

Details

Reviewers

chandlerc
reames
atrick
rnk

Commits

rGb51325dbdb10: Introduce @llvm.experimental.deoptimize
rL263281: Introduce @llvm.experimental.deoptimize

Summary

This intrinsic, together with deoptimization operand bundles, allows
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.

In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality. In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.
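
As a rough illustration of the "slow path of a specialized function" case, here is a plain C++ sketch (not LLVM IR, and not part of this patch). The names `DeoptState`, `sum_abs_fast` and `sum_abs_generic` are made up for the example. The specialized version assumes non-negative input; when that assumption fails it hands its frame-local state to the generic version in tail position and returns whatever the generic version returns, which is the shape a call to the intrinsic is required to have.

```cpp
#include <cstddef>
#include <vector>

// Frame-local state handed over at the exit point. In the IR this role is
// played by the values listed in the "deopt" operand bundle.
struct DeoptState {
  std::size_t index;  // next element to process
  long long partial;  // sum accumulated so far
};

// Generic, slower version: resumes from the captured state and handles
// every input, including negative elements.
long long sum_abs_generic(const std::vector<int> &v, DeoptState s) {
  for (std::size_t i = s.index; i < v.size(); ++i)
    s.partial += v[i] < 0 ? -static_cast<long long>(v[i]) : v[i];
  return s.partial;
}

// Specialized, faster version: assumes every element is non-negative.
long long sum_abs_fast(const std::vector<int> &v) {
  long long partial = 0;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (v[i] < 0)
      // Assumption violated: leave this version for good. The call is in
      // tail position and its result is returned unchanged.
      return sum_abs_generic(v, DeoptState{i, partial});
    partial += v[i];
  }
  return partial;
}
```

Control only ever flows out of the specialized version here; nothing in the sketch re-enters `sum_abs_fast` from the generic code.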

Diff Detail

Event Timeline

sanjoy updated this revision to Diff 49412. Feb 29 2016, 1:43 PM

sanjoy retitled this revision to Introduce @llvm.experimental.deoptimize.

sanjoy updated this object.

sanjoy added reviewers: atrick, chandlerc, rnk, reames.

sanjoy added subscribers: JosephTremoulet, maksfb, mjacob and 2 others.

Herald added a subscriber: mcrosier. Feb 29 2016, 1:43 PM

chandlerc added inline comments. Feb 29 2016, 1:54 PM

docs/LangRef.rst
12147–12148	Would it be more precise to say that it must be a 'musttail' call?

sanjoy added inline comments. Feb 29 2016, 2:19 PM

docs/LangRef.rst
12147–12148	At the IR level, semantically, it does "look like" a `musttail` call, but at the ABI level it is not a `musttail` call from LLVM's perspective. Usually the caller frame will have state that is needed to transition into the deopt continuation, and so it is not okay for LLVM to lower this call as "pop frame and jump". This is why I'm hesitant to specify that calls to experimental.deoptimize have to be `musttail`. On the other hand, managed languages typically do have a guarantee that once we've fully transitioned into the generic / de-specialized function, the caller frame will have been fully popped; so it is a lot like a `musttail` call from the "eventual stack growth" perspective. A second minor difference is that the restrictions are a little stricter -- because the intrinsic is polymorphic on the return value, there is no need to allow the optional `bitcast` (this part is fairly inconsequential though). So the summary is that I'm somewhat on the fence on making this a `musttail` call (or not), and given that changing our mind later would not involve a lot of interesting code-churn, I'd like to defer that decision to a later point. However if either of the points in the first paragraph strongly resonate with you please let me know.

ping?

It looks like this is missing codegen support. For now, I'd suggest lowering to a call to a known symbol. __llvm_deoptimize?

docs/LangRef.rst
12107	Naming wise, I liked the term side exit much better.
12115	Might be better to declare this as taking unspecified arguments which are interpreted by the runtime.
12124	As written, this explanation is too generic. The intrinsic specifically only allows transfer of control in one direction - leaving the current code. It does not allow entries. Having a separate intrinsic which modelled OSR entry points might some day be useful, but that's a different problem.
12126	Java, or JavaScript
12127	or "side exit"
12143	Might be good to say something about the interpretation of deopt continuation as being explicitly out of scope for LLVM. (i.e. go invoke random code which must...)
lib/Transforms/Utils/InlineFunction.cpp
1806	Can't you just skip the entire region if the return type of the caller is the same as the callee? Actually, no, you need to remove them from the normal returns so that we early terminate the parent as well. Can you restructure/add comments to make both parts clear?
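
A toy model of the two parts, in self-contained C++ with no LLVM types. The names `ToyReturn`, `SplitReturns` and `split_returns` are invented for this sketch and do not appear in the patch. It only mirrors the partitioning logic: deoptimize exits are always dropped from the normal returns, and they are rewritten only when the caller's return type differs from the callee's.

```cpp
#include <vector>

// Toy stand-in for a return cloned from the inlinee; not an LLVM type.
struct ToyReturn {
  int id;                   // identifies the return in this sketch
  bool follows_deoptimize;  // block ends in "call deoptimize; ret"
};

struct SplitReturns {
  std::vector<int> normal;     // merged into the caller's control flow
  std::vector<int> rewritten;  // exits re-emitted for the caller's return type
};

SplitReturns split_returns(const std::vector<ToyReturn> &returns,
                           bool return_types_differ) {
  SplitReturns out;
  for (const ToyReturn &r : returns) {
    if (!r.follows_deoptimize) {
      out.normal.push_back(r.id);
      continue;
    }
    // Deoptimize exits never join the normal returns, so they terminate
    // the caller as well. They only need rewriting when the intrinsic's
    // return type has to change.
    if (return_types_differ)
      out.rewritten.push_back(r.id);
  }
  return out;
}
```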

I was also confused by the missing codegen support. Otherwise looks fine to me. I'll let Philip decide when it's ready to accept.

sanjoy updated this object. Mar 7 2016, 3:53 PM

sanjoy edited edge metadata.

Add a lowering strategy to FastISel and SelectionDAG -- the intrinsic is lowered to __llvm_deoptimize.
The intrinsic now takes an argument of unspecified type. The contract of what the argument means and how __llvm_deoptimize interprets it is open-ended and up to the frontend.

Herald added a subscriber: MatzeB. Mar 7 2016, 3:54 PM

sanjoy added inline comments. Mar 7 2016, 3:55 PM

docs/LangRef.rst
12107	I'm probably going to rename `@guard_on` as `@guarded_side_exit`.

My previous documentation comments have not been addressed.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5447 ↗	(On Diff #50005)	possibly we should have experimental in the name? __llvm_experimental_deoptimize?
lib/Transforms/Utils/InlineFunction.cpp
1814	minor: the implicit bool conversion here is confusing. nullptr != X would be clearer. Alternatively, have the lambda explicitly return bool.
1829	I'd intended in my previous comment to indicate that the intrinsic should take an unlimited number of untyped arguments. Not one untyped argument.

This revision now requires changes to proceed. Mar 9 2016, 12:10 PM

Re: doc comments -- I think I did address all of them, can you point out the ones that you are not okay with in the current form?

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5447 ↗	(On Diff #50005)	Good idea, will fix.
lib/Transforms/Utils/InlineFunction.cpp
1814	SGTM, will fix shortly.
1829	I thought about that -- having a varargs signature is not more general (since you can just create an aggregate type / alloca and pass that in), and did not feel like it was worth the complexity (e.g. we'd probably not want the call to be a varargs call at the ABI level).

In D17732#371098, @sanjoy wrote:

Re: doc comments -- I think I did address all of them, can you point out the ones that you are not okay with in the current form?

You're correct. For some reason the doc changes didn't show up in the diff between versions (or I just missed them). All of the doc comments have been addressed satisfactorily.

With the current comments addressed, LGTM.

lib/Transforms/Utils/InlineFunction.cpp
1829	Hm, I disagree with you, but let's separate this out. For this review, let's use a form with no arguments. We can extend it later to support the multiple argument form if desired. That can be done as a follow up change. (I specifically don't want the one argument form going in because it requires one argument when a completely legitimate runtime might not need any.)

This revision is now accepted and ready to land. Mar 9 2016, 12:36 PM

The lowering I'm doing here is basically incorrect. The lowering should be going through RS4GC / gc.statepoint (since that is the only way LLC can understand deopt state). There is also a missing assert in SelectionDAG and FastISel -- at this point they do not know how to lower operand bundles, and should assert if they do see a call with an operand bundle (had this assert been there I'd not have misled myself).

Also, given that we'll go through RS4GC, it makes the "lower varargs to regular args" bit fairly easy.

Given that the lowering bits will be a little more complex now, what do you think about the following plan:

Rip out the lowering parts from this change
Change the spec to have deoptimize take varargs
Introduce lowering via RS4GC in a separate change

(I'll also add some assertions to SelectionDAG and FastISel to fail on calls with operand bundles, but that's not directly relevant here)

Sanjoy and I talked about this offline; let me summarize what we talked about.

I expressed discomfort with reusing RS4GC as Sanjoy suggested. While this would seem to be the path of least resistance, it seemed odd to me to couple deopt and GC so closely. We clearly need to reuse the backend representation (i.e. the STATEPOINT pseudo op) since we need the stack lowering and stack map entry pieces, but reusing the IR rewriting/lowering code seemed overly coupled.

After discussion, we settled on the following approach. Sanjoy is going to examine whether we could reuse the StatepointLowering.cpp code directly from SelectionDAGBuilder's intrinsic lowering code. This would imply that we don't need to wrap a call to @llvm.experimental.deoptimize in a statepoint to get it lowered to a STATEPOINT pseudo op. It also gives us the building block for a possible future change which removes the deopt arguments from the statepoint intrinsic entirely in favour of a deopt bundle attached to the call to the statepoint.

(Worth noting, we could have used a PATCHPOINT pseudo op and not a STATEPOINT one for the lowering of the deoptimize call. The difference is live-on-call/in-regs vs live-through-call/on-stack semantics for the deopt arguments. We decided to use the STATEPOINT pseudo op mostly because that's what we have more experience with. We may revisit this lowering detail in the future, but it wasn't worth the churn right now.)

p.s. I'm deferring to Sanjoy on whether he wants to do everything within this patch or separate the lowering into its own patch. He'll make that decision and potentially land this patch without the lowering code or argument handling. I've already given an LGTM for that part and that still stands.

This revision

Fixes the nits pointed out earlier
Removes lowering code (that will be addressed later, as Philip mentioned)
Changes the signature to varargs
Adds support for calls to @llvm.experimental.deoptimize in a function being invoked, and relevant test cases

The last bit is a semantic change that hasn't been LGTM'ed or
discussed previously; hence this re-review.

ping!

The additional handling for inlining invokes to functions that contain deoptimize calls is fairly straightforward; it'd be great to get a quick re-LGTM on this.

I looked at the inliner stuff, and it looks good to me.

docs/LangRef.rst
12147–12148	I don't think it would be useful to mark these as `musttail`. You will end up fighting with the existing musttail logic, which still merges returns after a tail call into normal control flow.
lib/Transforms/Utils/InlineFunction.cpp
1832–1834	Maybe do this outside the loop to save some lookups.

Closed by commit rL263281: Introduce @llvm.experimental.deoptimize (authored by sanjoy). Mar 11 2016, 11:13 AM

This revision was automatically updated to reflect the committed changes.

sanjoy marked an inline comment as done.

Revision Contents

Path                                              Size
docs/LangRef.rst                                  58 lines
include/llvm/IR/BasicBlock.h                      8 lines
include/llvm/IR/CallSite.h                        4 lines
include/llvm/IR/Intrinsics.td                     3 lines
lib/IR/BasicBlock.cpp                             15 lines
lib/IR/Verifier.cpp                               23 lines
lib/Transforms/Utils/InlineFunction.cpp           45 lines
test/Transforms/Inline/deoptimize-intrinsic.ll    38 lines
test/Verifier/deoptimize-intrinsic.ll             42 lines

Diff 49412

docs/LangRef.rst

[...]
 - Calls and invokes with operand bundles have unknown read / write
[...]
   ``readnone`` or ``readonly``), unless they're overridden with
   callsite specific attributes.
 - An operand bundle at a call site cannot change the implementation
   of the called function. Inter-procedural optimizations work as
   usual as long as they take into account the first two properties.

 More specific types of operand bundles are described below.

+.. _deopt_opbundles:
+
 Deoptimization Operand Bundles
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Deoptimization operand bundles are characterized by the ``"deopt"``
 operand bundle tag. These operand bundles represent an alternate
 "safe" continuation for the call site they're attached to, and can be
 used by a suitable runtime to deoptimize the compiled frame at the
 specified call site. There can be at most one ``"deopt"`` operand
[...]
 None.

 Semantics:
 """"""""""

 This intrinsic does nothing, and it's removed by optimizers and ignored
 by codegen.

+'``llvm.experimental.deoptimize``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare type @llvm.experimental.deoptimize()
+
+Overview:
+"""""""""
+
+This intrinsic, together with :ref:`deoptimization operand bundles
+<deopt_opbundles>`, allow frontends to express transfer of control and
+frame-local state from one (typically more specialized, hence faster)
+version of a function into another (typically more generic, hence
+slower) version.
+
+In languages with a fully integrated managed runtime like Java this
+intrinsic can be used to implement "uncommon trap" like functionality.
+In unmanaged languages like C and C++, this intrinsic can be used to
+represent the slow paths of specialized functions.
+
+
+Arguments:
+""""""""""
+
+None.
+
+Semantics:
+""""""""""
+
+The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
+deoptimization continuation (denoted using a :ref:`deoptimization
+operand bundle <deopt_opbundles>`) and returns the value returned by
+the deoptimization continuation.
+
+Deoptimization continuations expressed using ``"deopt"`` operand bundles always
+continue execution to the end of the physical frame containing them, so all
+calls to ``@llvm.experimental.deoptimize`` must be in "tail
+position":
+
+- ``@llvm.experimental.deoptimize`` cannot be invoked.
+- The call must immediately precede a :ref:`ret <i_ret>` instruction.
+- The ret instruction must return the value produced by the
+  ``@llvm.experimental.deoptimize`` call if there is one, or void.
+
+Note that the above restrictions imply that the return type for a call
+to ``@llvm.experimental.deoptimize`` will match the return type of its
+immediate caller.
+
+The inliner composes the ``"deopt"`` continuations of the caller into the
+``"deopt"`` continuations present in the inlinee, and also updates calls to this
+intrinsic to return directly from the frame of the function it inlined into.
+
 Stack Map Intrinsics
 --------------------

 LLVM provides experimental intrinsics to support runtime patching
 mechanisms commonly desired in dynamic language JITs. These intrinsics
 are described in :doc:`StackMaps`.
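
The three "tail position" restrictions in the LangRef text above can be modelled with a small self-contained C++ check over a toy instruction list. This is only a sketch: `ToyInst`, `kDeoptimize` and `deoptimize_calls_in_tail_position` are hypothetical names for illustration, it uses no LLVM classes, and it is not the verifier code from this patch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction for this sketch only; not an LLVM class.
struct ToyInst {
  enum Kind { Call, Invoke, Ret, Other };
  Kind kind;
  std::string callee;  // callee name for Call/Invoke, empty otherwise
  int result;          // id of the value this instruction defines, -1 if none
  int returned;        // for Ret: id of the returned value, -1 for `ret void`
};

const char *const kDeoptimize = "llvm.experimental.deoptimize";

// Returns true if every use of the intrinsic in the block is a plain call
// that is immediately followed by a ret of the call's value (or ret void).
bool deoptimize_calls_in_tail_position(const std::vector<ToyInst> &block) {
  for (std::size_t i = 0; i < block.size(); ++i) {
    const ToyInst &inst = block[i];
    bool is_call_like =
        inst.kind == ToyInst::Call || inst.kind == ToyInst::Invoke;
    if (!is_call_like || inst.callee != kDeoptimize)
      continue;
    if (inst.kind == ToyInst::Invoke)
      return false;  // the intrinsic cannot be invoked
    if (i + 1 == block.size() || block[i + 1].kind != ToyInst::Ret)
      return false;  // the call must immediately precede a ret
    if (block[i + 1].returned != inst.result)
      return false;  // the ret must return the call's value, or void
  }
  return true;
}
```

In this model a void call has `result == -1`, so it matches only a `ret void` (`returned == -1`).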

include/llvm/IR/BasicBlock.h

[...]
 public:
[...]
   const Module *getModule() const;
   Module *getModule();

   /// \brief Returns the terminator instruction if the block is well formed or
   /// null if the block is not well formed.
   TerminatorInst *getTerminator();
   const TerminatorInst *getTerminator() const;

+  /// \brief Returns the call instruction calling @llvm.experimental.deoptimize
+  /// prior to the terminating return instruction of this basic block, if such a
+  /// call is present. Otherwise, returns null.
+  CallInst *getTerminatingDeoptimizeCall();
+  const CallInst *getTerminatingDeoptimizeCall() const {
+    return const_cast<BasicBlock *>(this)->getTerminatingDeoptimizeCall();
+  }
+
   /// \brief Returns the call instruction marked 'musttail' prior to the
   /// terminating return instruction of this basic block, if such a call is
   /// present. Otherwise, returns null.
   CallInst *getTerminatingMustTailCall();
   const CallInst *getTerminatingMustTailCall() const {
     return const_cast<BasicBlock *>(this)->getTerminatingMustTailCall();
   }

[...]

include/llvm/IR/CallSite.h

[...]
 #define CALLSITE_DELEGATE_SETTER(METHOD) \
[...]
   Optional<OperandBundleUse> getOperandBundle(StringRef Name) const {
     CALLSITE_DELEGATE_GETTER(getOperandBundle(Name));
   }

   Optional<OperandBundleUse> getOperandBundle(uint32_t ID) const {
     CALLSITE_DELEGATE_GETTER(getOperandBundle(ID));
   }

+  unsigned countOperandBundlesOfType(uint32_t ID) const {
+    CALLSITE_DELEGATE_GETTER(countOperandBundlesOfType(ID));
+  }
+
   IterTy arg_begin() const {
     CALLSITE_DELEGATE_GETTER(arg_begin());
   }

   IterTy arg_end() const {
     CALLSITE_DELEGATE_GETTER(arg_end());
   }

[...]

include/llvm/IR/Intrinsics.td

[...]
 //
 def int_flt_rounds : Intrinsic<[llvm_i32_ty]>,
                      GCCBuiltin<"__builtin_flt_rounds">;
 def int_trap : Intrinsic<[], [], [IntrNoReturn]>,
                GCCBuiltin<"__builtin_trap">;
 def int_debugtrap : Intrinsic<[]>,
                     GCCBuiltin<"__builtin_debugtrap">;

+// Support for dynamic deoptimization (or de-specialization)
+def int_experimental_deoptimize : Intrinsic<[llvm_any_ty], [], []>;
+
 // NOP: calls/invokes to this intrinsic are removed by codegen
 def int_donothing : Intrinsic<[], [], [IntrNoMem]>;

 // Intrisics to support half precision floating point format
 let IntrProperties = [IntrNoMem] in {
 def int_convert_to_fp16 : Intrinsic<[llvm_i16_ty], [llvm_anyfloat_ty]>;
 def int_convert_from_fp16 : Intrinsic<[llvm_anyfloat_ty], [llvm_i16_ty]>;
 }
[...]

lib/IR/BasicBlock.cpp

[...]
 CallInst *BasicBlock::getTerminatingMustTailCall() {
[...]

   if (auto *CI = dyn_cast<CallInst>(Prev)) {
     if (CI->isMustTailCall())
       return CI;
   }
   return nullptr;
 }

+CallInst *BasicBlock::getTerminatingDeoptimizeCall() {
+  if (InstList.empty())
+    return nullptr;
+  auto *RI = dyn_cast<ReturnInst>(&InstList.back());
+  if (!RI || RI == &InstList.front())
+    return nullptr;
+
+  if (auto *CI = dyn_cast_or_null<CallInst>(RI->getPrevNode()))
+    if (Function *F = CI->getCalledFunction())
+      if (F->getIntrinsicID() == Intrinsic::experimental_deoptimize)
+        return CI;
+
+  return nullptr;
+}
+
 Instruction* BasicBlock::getFirstNonPHI() {
   for (Instruction &I : *this)
     if (!isa<PHINode>(I))
       return &I;
   return nullptr;
 }

 Instruction* BasicBlock::getFirstNonPHIOrDbg() {
[...]

lib/IR/Verifier.cpp

[...]
 case Intrinsic::masked_store: {
[...]
     Type *DataTy = cast<PointerType>(Ptr->getType())->getElementType();
     Assert(DataTy == Val->getType(),
            "masked_store: storee must match pointer type", CS);
     Assert(Mask->getType()->getVectorNumElements() ==
                DataTy->getVectorNumElements(),
            "masked_store: vector mask must be same length as data", CS);
     break;
   }
+
+  case Intrinsic::experimental_deoptimize: {
+    Assert(CS.isCall(), "experimental_deoptimize cannot be invoked", CS);
+    Assert(CS.countOperandBundlesOfType(LLVMContext::OB_deopt) == 1,
+           "experimental_deoptimize must have exactly one "
+           "\"deopt\" operand bundle");
+    Assert(CS.getType() == CS.getInstruction()->getFunction()->getReturnType(),
+           "experimental_deoptimize return type must match caller return type");
+
+    if (CS.isCall()) {
+      auto *DeoptCI = CS.getInstruction();
+      auto *RI = dyn_cast<ReturnInst>(DeoptCI->getNextNode());
+      Assert(RI,
+             "calls to experimental_deoptimize must be followed by a return");
+
+      if (!CS.getType()->isVoidTy() && RI)
+        Assert(RI->getReturnValue() == DeoptCI,
+               "calls to experimental_deoptimize must be followed by a return "
+               "of the value computed by experimental_deoptimize");
+    }
+
+    break;
+  }
   };
 }

 /// \brief Carefully grab the subprogram from a local scope.
 ///
 /// This carefully grabs the subprogram from a local scope, avoiding the
 /// built-in assertions that would typically fire.
 static DISubprogram *getSubprogram(Metadata *LocalScope) {
[...]

lib/Transforms/Utils/InlineFunction.cpp

Show First 20 Lines • Show All 1,607 Lines • ▼ Show 20 Lines	for (BasicBlock::iterator I = FirstNewBlock->begin(),
InsertPoint, FirstNewBlock->getInstList(), AI->getIterator(), I);		InsertPoint, FirstNewBlock->getInstList(), AI->getIterator(), I);
}		}
// Move any dbg.declares describing the allocas into the entry basic block.		// Move any dbg.declares describing the allocas into the entry basic block.
DIBuilder DIB(*Caller->getParent());		DIBuilder DIB(*Caller->getParent());
for (auto &AI : IFI.StaticAllocas)		for (auto &AI : IFI.StaticAllocas)
replaceDbgDeclareForAlloca(AI, AI, DIB, /Deref=/false);		replaceDbgDeclareForAlloca(AI, AI, DIB, /Deref=/false);
}		}

bool InlinedMustTailCalls = false;		bool InlinedMustTailCalls = false, InlinedDeoptimizeCalls = false;
if (InlinedFunctionInfo.ContainsCalls) {		if (InlinedFunctionInfo.ContainsCalls) {
CallInst::TailCallKind CallSiteTailKind = CallInst::TCK_None;		CallInst::TailCallKind CallSiteTailKind = CallInst::TCK_None;
if (CallInst *CI = dyn_cast<CallInst>(TheCall))		if (CallInst *CI = dyn_cast<CallInst>(TheCall))
CallSiteTailKind = CI->getTailCallKind();		CallSiteTailKind = CI->getTailCallKind();

for (Function::iterator BB = FirstNewBlock, E = Caller->end(); BB != E;		for (Function::iterator BB = FirstNewBlock, E = Caller->end(); BB != E;
++BB) {		++BB) {
for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
CallInst *CI = dyn_cast<CallInst>(&I);		CallInst *CI = dyn_cast<CallInst>(&I);
if (!CI)		if (!CI)
continue;		continue;

		if (Function *F = CI->getCalledFunction())
		InlinedDeoptimizeCalls \|=
		F->getIntrinsicID() == Intrinsic::experimental_deoptimize;

// We need to reduce the strength of any inlined tail calls. For		// We need to reduce the strength of any inlined tail calls. For
// musttail, we have to avoid introducing potential unbounded stack		// musttail, we have to avoid introducing potential unbounded stack
// growth. For example, if functions 'f' and 'g' are mutually recursive		// growth. For example, if functions 'f' and 'g' are mutually recursive
// with musttail, we can inline 'g' into 'f' so long as we preserve		// with musttail, we can inline 'g' into 'f' so long as we preserve
// musttail on the cloned call to 'f'. If either the inlined call site		// musttail on the cloned call to 'f'. If either the inlined call site
// or the cloned call site is not musttail, the program already has		// or the cloned call site is not musttail, the program already has
// one frame of stack growth, so it's safe to remove musttail. Here is		// one frame of stack growth, so it's safe to remove musttail. Here is
// a table of example transformations:		// a table of example transformations:
▲ Show 20 Lines • Show All 157 Lines • ▼ Show 20 Lines	for (Function::iterator BB = FirstNewBlock->getIterator(),
} else {		} else {
auto *FPI = cast<FuncletPadInst>(I);		auto *FPI = cast<FuncletPadInst>(I);
if (isa<ConstantTokenNone>(FPI->getParentPad()))		if (isa<ConstantTokenNone>(FPI->getParentPad()))
FPI->setParentPad(CallSiteEHPad);		FPI->setParentPad(CallSiteEHPad);
}		}
}		}
}		}

		if (InlinedDeoptimizeCalls) {
		reamesUnsubmitted Done Reply Inline Actions Can't you just skip the entire region if the return type of the caller is the same as the callee? Actually, no, you need to remove them from the normal returns so that we early terminate the parent as well. Can you restructure/add comments to make both parts clear? reames: Can't you just skip the entire region if the return type of the caller is the same as the…
		Function *NewDeoptIntrinsic = nullptr;
		if (Caller->getReturnType() != TheCall->getType())
		NewDeoptIntrinsic = Intrinsic::getDeclaration(
		Caller->getParent(), Intrinsic::experimental_deoptimize,
		{Caller->getReturnType()});

		SmallVector<ReturnInst *, 8> NormalReturns;
		for (ReturnInst *RI : Returns) {
		reamesUnsubmitted Not Done Reply Inline Actions minor: the implicit bool conversion here is confusing. nullptr != X would be clearer. Alternatively, have the lambda explicitly return bool. reames: minor: the implicit bool conversion here is confusing. nullptr != X would be clearer.
		sanjoyAuthorUnsubmitted Done Reply Inline Actions SGTM, will fix shortly. sanjoy: SGTM, will fix shortly.
		CallInst *DeoptCall = RI->getParent()->getTerminatingDeoptimizeCall();
		if (!DeoptCall) {
		NormalReturns.push_back(RI);
		continue;
		}
		if (!NewDeoptIntrinsic)
		continue;

		auto *CurBB = RI->getParent();
		RI->eraseFromParent();

		SmallVector<OperandBundleDef, 1> OpBundles;
		DeoptCall->getOperandBundlesAsDefs(OpBundles);
		DeoptCall->eraseFromParent();
		assert(!OpBundles.empty() &&
		reames (Not Done): I'd intended in my previous comment to indicate that the intrinsic should take an unlimited number of untyped arguments. Not one untyped argument.
		sanjoy (Not Done): I thought about that -- having a varargs signature is not more general (since you can just create an aggregate type / alloca and pass that in), and did not feel like it was worth the complexity (e.g. we'd probably not want the call to be a varargs call at the ABI level).
		reames (Not Done): Hm, I disagree with you, but let's separate this out. For this review, let's use a form with no arguments. We can extend it later to support the multiple argument form if desired. That can be done as a follow-up change. (I specifically don't want the one argument form going in because it requires one argument when a completely legitimate runtime might not need any.)
		             "Expected at least the deopt operand bundle");

		      IRBuilder<> Builder(CurBB);
		      Value *NewDeoptCall =
		          Builder.CreateCall(NewDeoptIntrinsic, {}, OpBundles);
		rnk (Done): Maybe do this outside the loop to save some lookups.
		      if (NewDeoptCall->getType()->isVoidTy())
		        Builder.CreateRetVoid();
		      else
		        Builder.CreateRet(NewDeoptCall);
		    }

		    // Leave behind the normal returns so we can merge control flow.
		    std::swap(Returns, NormalReturns);
		  }
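
Note on the loop above: reames's point is that returns following a deoptimize call must leave the normal-return set even when the caller and callee return types match, and are rewritten only when the types differ. The sketch below is a toy model of that partitioning, not LLVM code: `ToyReturn`, `ToyResult`, and `partitionReturns` are hypothetical names invented for illustration, and a plain `bool` stands in for "`NewDeoptIntrinsic` is non-null".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a ReturnInst in the inlined body.
struct ToyReturn {
  std::string Block;     // name of the block that holds the return
  bool FollowsDeoptCall; // true if the block ends in "deoptimize call + ret"
};

// Hypothetical result of one pass over the returns.
struct ToyResult {
  std::vector<ToyReturn> NormalReturns;     // returns still merged into the caller
  std::vector<std::string> RewrittenBlocks; // blocks whose deopt call is re-emitted
};

// Same shape as the loop in the patch: deopt returns are always dropped from
// the normal-return set, and are rewritten only when the return types differ.
ToyResult partitionReturns(const std::vector<ToyReturn> &Returns,
                           bool ReturnTypesDiffer) {
  ToyResult Result;
  for (const ToyReturn &RI : Returns) {
    assert(!RI.Block.empty() && "Expected every return to live in a named block");
    if (!RI.FollowsDeoptCall) {
      Result.NormalReturns.push_back(RI);
      continue;
    }
    if (!ReturnTypesDiffer)
      continue; // the existing call and ret already have the right type
    Result.RewrittenBlocks.push_back(RI.Block);
  }
  return Result;
}
```

In this model, a callee with one deopt return and one ordinary return yields a single normal return either way; only the `ReturnTypesDiffer` case records a block to rewrite.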

		  // Handle any inlined musttail call sites. In order for a new call site to be
		  // musttail, the source of the clone and the inlined call site must have been
		  // musttail. Therefore it's safe to return without merging control into the
		  // phi below.
		  if (InlinedMustTailCalls) {
		    // Check if we need to bitcast the result of any musttail calls.
		    Type *NewRetTy = Caller->getReturnType();
		    bool NeedBitCast = !TheCall->use_empty() && TheCall->getType() != NewRetTy;

test/Transforms/Inline/deoptimize-intrinsic.ll

This file was added.

				; RUN: opt -S -always-inline < %s | FileCheck %s

				declare i8 @llvm.experimental.deoptimize.i8()

				define i8 @callee(i1 %c) alwaysinline {
				  br i1 %c, label %left, label %right

				left:
				  %v = call i8 @llvm.experimental.deoptimize.i8() [ "deopt"(i32 1) ]
				  ret i8 %v

				right:
				  ret i8 0
				}

				define void @caller_0(i1 %c, i8* %ptr) {
				; CHECK-LABEL: @caller_0(
				entry:
				  %v = call i8 @callee(i1 %c) [ "deopt"(i32 2) ]
				; CHECK: left.i:
				; CHECK-NEXT: call void @llvm.experimental.deoptimize.isVoid() [ "deopt"(i32 2, i32 1) ]
				; CHECK-NEXT: ret void

				  store i8 %v, i8* %ptr
				  ret void
				}

				define i32 @caller_1(i1 %c, i8* %ptr) {
				; CHECK-LABEL: @caller_1(
				entry:
				  %v = call i8 @callee(i1 %c) [ "deopt"(i32 3) ]
				; CHECK: left.i:
				; CHECK-NEXT: %0 = call i32 @llvm.experimental.deoptimize.i32() [ "deopt"(i32 3, i32 1) ]
				; CHECK-NEXT: ret i32 %0

				  store i8 %v, i8* %ptr
				  ret i32 42
				}

test/Verifier/deoptimize-intrinsic.ll

This file was added.

				; RUN: not opt -verify < %s 2>&1 | FileCheck %s

				declare i8 @llvm.experimental.deoptimize.i8()
				declare void @llvm.experimental.deoptimize.isVoid()

				declare void @unknown()

				define void @f_notail() {
				entry:
				  call void @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
				; CHECK: calls to experimental_deoptimize must be followed by a return
				  call void @unknown()
				  ret void
				}

				define void @f_nodeopt() {
				entry:
				  call void @llvm.experimental.deoptimize.isVoid()
				; CHECK: experimental_deoptimize must have exactly one "deopt" operand bundle
				  ret void
				}

				define void @f_invoke() personality i8 3 {
				entry:
				  invoke void @llvm.experimental.deoptimize.isVoid() to label %ok unwind label %not_ok
				; CHECK: experimental_deoptimize cannot be invoked

				ok:
				  ret void

				not_ok:
				  %0 = landingpad { i8*, i32 }
				          filter [0 x i8*] zeroinitializer
				  ret void
				}

				define i8 @f_incorrect_return() {
				entry:
				  %val = call i8 @llvm.experimental.deoptimize.i8() [ "deopt"() ]
				; CHECK: calls to experimental_deoptimize must be followed by a return of the value computed by experimental_deoptimize
				  ret i8 0
				}