This is an archive of the discontinued LLVM Phabricator instance.


Differential D17732

Introduce @llvm.experimental.deoptimize
ClosedPublic

Authored by sanjoy on Feb 29 2016, 1:43 PM.

Details

Reviewers

chandlerc
reames
atrick
rnk

Commits

rGb51325dbdb10: Introduce @llvm.experimental.deoptimize
rL263281: Introduce @llvm.experimental.deoptimize

Summary

This intrinsic, together with deoptimization operand bundles, allows
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.

In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality. In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.

Diff Detail

Repository: rL LLVM

Event Timeline

sanjoy updated this revision to Diff 49412. Feb 29 2016, 1:43 PM

sanjoy retitled this revision to Introduce @llvm.experimental.deoptimize.

sanjoy updated this object.

sanjoy added reviewers: atrick, chandlerc, rnk, reames.

sanjoy added subscribers: JosephTremoulet, maksfb, mjacob and 2 others.

Herald added a subscriber: mcrosier. Feb 29 2016, 1:43 PM

chandlerc added inline comments. Feb 29 2016, 1:54 PM

docs/LangRef.rst
12147–12148 ↗	(On Diff #49412)	Would it be more precise to say that it must be a 'musttail' call?

sanjoy added inline comments. Feb 29 2016, 2:19 PM

docs/LangRef.rst
12147–12148 ↗	(On Diff #49412)	At the IR level, semantically, it does "look like" a `musttail` call, but at the ABI level it is not a `musttail` call from LLVM's perspective. Usually the caller frame will have state that is needed to transition into the deopt continuation, and so it is not okay for LLVM to lower this call as "pop frame and jump". This is why I'm hesitant to specify that calls to experimental.deoptimize have to be `musttail`. On the other hand, managed languages typically do have a guarantee that once we've fully transitioned into the generic / de-specialized function, the caller frame will have been fully popped; so it is a lot like a `musttail` call from the "eventual stack growth" perspective.

A second minor difference is that the restrictions are a little stricter -- because the intrinsic is polymorphic on the return value, there is no need to allow the optional `bitcast` (this part is fairly inconsequential though).

So the summary is that I'm somewhat on the fence on making this a `musttail` call (or not), and given that changing our mind later would not involve a lot of interesting code-churn, I'd like to defer that decision to a later point. However, if either of the points in the first paragraph strongly resonates with you, please let me know.

ping?

It looks like this is missing codegen support. For now, I'd suggest lowering to a call to a known symbol. __llvm_deoptimize?

docs/LangRef.rst
12107 ↗	(On Diff #49412)	Naming wise, I liked the term side exit much better.
12115 ↗	(On Diff #49412)	Might be better to declare this as taking unspecified arguments which are interpreted by the runtime.
12124 ↗	(On Diff #49412)	As written, this explanation is too generic. The intrinsic specifically only allows transfer of control in one direction - leaving the current code. It does not allow entries. Having a separate intrinsic which modelled OSR entry points might some day be useful, but that's a different problem.
12126 ↗	(On Diff #49412)	Java, or JavaScript
12127 ↗	(On Diff #49412)	or "side exit"
12143 ↗	(On Diff #49412)	Might be good to say something about the interpretation of deopt continuation as being explicitly out of scope for LLVM. (i.e. go invoke random code which must...)
lib/Transforms/Utils/InlineFunction.cpp
1806 ↗	(On Diff #49412)	Can't you just skip the entire region if the return type of the caller is the same as the callee? Actually, no, you need to remove them from the normal returns so that we early terminate the parent as well. Can you restructure/add comments to make both parts clear?

I was also confused by the missing codegen support. Otherwise looks fine to me. I'll let Philip decide when it's ready to accept.

sanjoy updated this object. Mar 7 2016, 3:53 PM

sanjoy edited edge metadata.

Add a lowering strategy to FastISel and SelectionDAG -- the intrinsic is lowered to __llvm_deoptimize.
The intrinsic now takes an argument of unspecified type. The contract of what the argument means and how __llvm_deoptimize interprets it is open-ended and up to the frontend.

Herald added a subscriber: MatzeB. Mar 7 2016, 3:54 PM

sanjoy added inline comments. Mar 7 2016, 3:55 PM

docs/LangRef.rst
12107 ↗	(On Diff #49412)	I'm probably going to rename `@guard_on` as `@guarded_side_exit`.

My previous documentation comments have not been addressed.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5447 ↗	(On Diff #50005)	possibly we should have experimental in the name? __llvm_experimental_deoptimize?
lib/Transforms/Utils/InlineFunction.cpp
1814 ↗	(On Diff #50005)	minor: the implicit bool conversion here is confusing. nullptr != X would be clearer. Alternatively, have the lambda explicitly return bool.
1829 ↗	(On Diff #50005)	I'd intended in my previous comment to indicate that the intrinsic should take an unlimited number of untyped arguments. Not one untyped argument.
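
As a hedged illustration of the comment on line 1814 above, the standalone sketch below shows a predicate that compares against `nullptr` and declares a `bool` return type instead of relying on an implicit pointer-to-bool conversion. `Call`, `Block`, and `dropDeoptBlocks` are hypothetical stand-ins invented for this example, not LLVM's classes; only the name `getTerminatingDeoptimizeCall` is borrowed from this diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for LLVM's CallInst and BasicBlock (assumption:
// the real classes are far richer; only the shape of the query matters here).
struct Call {};
struct Block {
  Call *DeoptCall = nullptr; // non-null when the block ends in a deoptimize call
  Call *getTerminatingDeoptimizeCall() const { return DeoptCall; }
};

// Erases every block that ends in a deoptimize call; returns how many remain.
inline std::size_t dropDeoptBlocks(std::vector<Block *> &Blocks) {
  // Explicit comparison and explicit bool return type, so the reader does
  // not have to spot a pointer being converted to bool.
  auto EndsInDeopt = [](const Block *B) -> bool {
    return B->getTerminatingDeoptimizeCall() != nullptr;
  };
  Blocks.erase(std::remove_if(Blocks.begin(), Blocks.end(), EndsInDeopt),
               Blocks.end());
  return Blocks.size();
}
```

Either spelling alone (the `!= nullptr` comparison or the `-> bool` return type) would address the readability concern; the sketch uses both only to show them side by side.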

This revision now requires changes to proceed. Mar 9 2016, 12:10 PM

Re: doc comments -- I think I did address all of them, can you point out the ones that you are not okay with in the current form?

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5447 ↗	(On Diff #50005)	Good idea, will fix.
lib/Transforms/Utils/InlineFunction.cpp
1814 ↗	(On Diff #50005)	SGTM, will fix shortly.
1829 ↗	(On Diff #50005)	I thought about that -- having a varargs signature is not more general (since you can just create an aggregate type / alloca and pass that in), and did not feel like it was worth the complexity (e.g. we'd probably not want the call to be a varargs call at the ABI level).

In D17732#371098, @sanjoy wrote:

Re: doc comments -- I think I did address all of them, can you point out the ones that you are not okay with in the current form?

You're correct. For some reason the doc changes didn't show up in the diff between versions (or I just missed them). All of the doc comments have been addressed satisfactorily.

With the current comments addressed, LGTM.

lib/Transforms/Utils/InlineFunction.cpp
1829 ↗	(On Diff #50005)	Hm, I disagree with you, but let's separate this out. For this review, let's use a form with no arguments. We can extend it later to support the multiple argument form if desired. That can be done as a follow up change. (I specifically don't want the one argument form going in because it requires one argument when a completely legitimate runtime might not need any.)

This revision is now accepted and ready to land. Mar 9 2016, 12:36 PM

The lowering I'm doing here is basically incorrect. The lowering should be going through RS4GC / gc.statepoint (since that is the only way LLC can understand deopt state). There is also a missing assert in SelectionDAG and FastISel -- at this point they do not know how to lower operand bundles, and should assert if they do see a call with an operand bundle (had this assert been there I'd not have misled myself).

Also, given that we'll go through RS4GC, it makes the "lower varargs to regular args" bit fairly easy.

Given that the lowering bits will be a little more complex now, what do you think about the following plan:

- Rip out the lowering parts from this change
- Change the spec to have deoptimize take varargs
- Introduce lowering via RS4GC in a separate change

(I'll also add some assertions to SelectionDAG and FastISel to fail on calls with operand bundles, but that's not directly relevant here)

Sanjoy and I talked about this offline; let me summarize what we talked about.

I expressed discomfort with reusing RS4GC as Sanjoy suggested. While this would seem to be the path of least resistance, it seemed odd to me to couple deopt and GC so closely together. We clearly need to reuse the backend representation (i.e. the STATEPOINT pseudo op) since we need the stack lowering and stack map entry pieces, but reusing the IR rewriting/lowering code seemed overly coupled.

After discussion, we settled on the following approach. Sanjoy is going to examine whether we could reuse the StatepointLowering.cpp code directly from SelectionDAGBuilder's intrinsic lowering code. This would imply that we don't need to wrap a call to @llvm.experimental.deoptimize in a statepoint to get it lowered to a STATEPOINT pseudo op. It also gives us the building block for a possible future change which removes the deopt arguments from the statepoint intrinsic entirely in favour of a deopt bundle attached to the call to the statepoint.

(Worth noting, we could have used a PATCHPOINT pseudo op and not a STATEPOINT one for the lowering of the deoptimize call. The difference is live-on-call/in-regs vs live-through-call/on-stack semantics for the deopt arguments. We decided to use the STATEPOINT pseudo op mostly because that's what we have more experience with. We may revisit this lowering detail in the future, but it wasn't worth the churn right now.)

p.s. I'm deferring to Sanjoy on whether he wants to do everything within this patch or separate the lowering into its own patch. He'll make that decision and potentially land this patch without the lowering code or argument handling. I've already given an LGTM for that part and that still stands.

This revision:

- Fixes the nits pointed out earlier
- Removes lowering code (that will be addressed later, as Philip mentioned)
- Changes the signature to varargs
- Adds support for calls to @llvm.experimental.deoptimize in a function being `invoke`d, and relevant test cases

The last bit is a semantic change that hasn't been LGTM'ed or
discussed previously; hence this re-review.

ping!

The additional handling for inlining invokes to functions that contain deoptimize calls is fairly straightforward; it'd be great to get a quick re-LGTM on this.

I looked at the inliner stuff, and it looks good to me.

docs/LangRef.rst
12147–12148 ↗	(On Diff #50215)	I don't think it would be useful to mark these as `musttail`. You will end up fighting with the existing musttail logic, which still merges returns after a tail call into normal control flow.
lib/Transforms/Utils/InlineFunction.cpp
1841–1843 ↗	(On Diff #50215)	Maybe do this outside the loop to save some lookups.

Closed by commit rL263281: Introduce @llvm.experimental.deoptimize (authored by sanjoy). Mar 11 2016, 11:13 AM

This revision was automatically updated to reflect the committed changes.

sanjoy marked an inline comment as done.

Revision Contents

Path	Size
llvm/trunk/docs/LangRef.rst	70 lines
llvm/trunk/include/llvm/ADT/STLExtras.h	7 lines
llvm/trunk/include/llvm/IR/BasicBlock.h	8 lines
llvm/trunk/include/llvm/IR/CallSite.h	4 lines
llvm/trunk/include/llvm/IR/Intrinsics.td	4 lines
llvm/trunk/lib/IR/BasicBlock.cpp	15 lines
llvm/trunk/lib/IR/Verifier.cpp	23 lines
llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp	65 lines
llvm/trunk/test/Transforms/Inline/deoptimize-intrinsic.ll	90 lines
llvm/trunk/test/Verifier/deoptimize-intrinsic.ll	42 lines

Diff 50452

llvm/trunk/docs/LangRef.rst

Show First 20 Lines • Show All 1,528 Lines • ▼ Show 20 Lines	- Calls and invokes with operand bundles have unknown read / write
``readnone`` or ``readonly``), unless they're overridden with		``readnone`` or ``readonly``), unless they're overridden with
callsite specific attributes.		callsite specific attributes.
- An operand bundle at a call site cannot change the implementation		- An operand bundle at a call site cannot change the implementation
of the called function. Inter-procedural optimizations work as		of the called function. Inter-procedural optimizations work as
usual as long as they take into account the first two properties.		usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.		More specific types of operand bundles are described below.

		.. _deopt_opbundles:

Deoptimization Operand Bundles		Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``		Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate		operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be		"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the		used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand		specified call site. There can be at most one ``"deopt"`` operand
▲ Show 20 Lines • Show All 10,552 Lines • ▼ Show 20 Lines
None.		None.

Semantics:		Semantics:
""""""""""		""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored		This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.		by codegen.

		'``llvm.experimental.deoptimize``' Intrinsic
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""

		::

		declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

		Overview:
		"""""""""

		This intrinsic, together with :ref:`deoptimization operand bundles
		<deopt_opbundles>`, allow frontends to express transfer of control and
		frame-local state from the currently executing (typically more specialized,
		hence faster) version of a function into another (typically more generic, hence
		slower) version.

		In languages with a fully integrated managed runtime like Java and JavaScript
		this intrinsic can be used to implement "uncommon trap" or "side exit" like
		functionality. In unmanaged languages like C and C++, this intrinsic can be
		used to represent the slow paths of specialized functions.


		Arguments:
		""""""""""

		The intrinsic takes an arbitrary number of arguments, whose meaning is
		decided by the :ref:`lowering strategy<deoptimize_lowering>`.

		Semantics:
		""""""""""

		The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
		deoptimization continuation (denoted using a :ref:`deoptimization
		operand bundle <deopt_opbundles>`) and returns the value returned by
		the deoptimization continuation. Defining the semantic properties of
		the continuation itself is out of scope of the language reference --
		as far as LLVM is concerned, the deoptimization continuation can
		invoke arbitrary side effects, including reading from and writing to
		the entire heap.

		Deoptimization continuations expressed using ``"deopt"`` operand bundles always
		continue execution to the end of the physical frame containing them, so all
		calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

		- ``@llvm.experimental.deoptimize`` cannot be invoked.
		- The call must immediately precede a :ref:`ret <i_ret>` instruction.
		- The ``ret`` instruction must return the value produced by the
		``@llvm.experimental.deoptimize`` call if there is one, or void.

		Note that the above restrictions imply that the return type for a call to
		``@llvm.experimental.deoptimize`` will match the return type of its immediate
		caller.

		The inliner composes the ``"deopt"`` continuations of the caller into the
		``"deopt"`` continuations present in the inlinee, and also updates calls to this
		intrinsic to return directly from the frame of the function it inlined into.

		.. _deoptimize_lowering:

		Lowering:
		"""""""""

		Lowering for ``@llvm.experimental.deoptimize`` is not yet implemented,
		and is a work in progress.

Stack Map Intrinsics		Stack Map Intrinsics
--------------------		--------------------

LLVM provides experimental intrinsics to support runtime patching		LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics		mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.		are described in :doc:`StackMaps`.

llvm/trunk/include/llvm/ADT/STLExtras.h

	Show First 20 Lines • Show All 387 Lines • ▼ Show 20 Lines

	/// Provide wrappers to std::find_if which take ranges instead of having to pass			/// Provide wrappers to std::find_if which take ranges instead of having to pass
	/// begin/end explicitly.			/// begin/end explicitly.
	template <typename R, class T>			template <typename R, class T>
	auto find_if(R &&Range, const T &Pred) -> decltype(Range.begin()) {			auto find_if(R &&Range, const T &Pred) -> decltype(Range.begin()) {
	return std::find_if(Range.begin(), Range.end(), Pred);			return std::find_if(Range.begin(), Range.end(), Pred);
	}			}

				/// Provide wrappers to std::remove_if which take ranges instead of having to
				/// pass begin/end explicitly.
				template<typename R, class UnaryPredicate>
				auto remove_if(R &&Range, UnaryPredicate &&P) -> decltype(Range.begin()) {
				return std::remove_if(Range.begin(), Range.end(), P);
				}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Extra additions to <memory>			// Extra additions to <memory>
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// Implement make_unique according to N3656.			// Implement make_unique according to N3656.

	/// \brief Constructs a `new T()` with the given args and returns a			/// \brief Constructs a `new T()` with the given args and returns a
	/// `unique_ptr<T>` which owns the object.			/// `unique_ptr<T>` which owns the object.
	▲ Show 20 Lines • Show All 76 Lines • Show Last 20 Lines
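
A minimal sketch of how a range-based `remove_if` wrapper like the one added above is typically paired with `erase`. The `sketch` namespace and the `dropNegatives` helper are assumptions made for this example; the wrapper body simply mirrors the shape shown in the hunk and uses only the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace sketch {
// Range-based wrapper with the same shape as the helper in this diff:
// forwards to std::remove_if using the range's begin() and end().
template <typename R, class UnaryPredicate>
auto remove_if(R &&Range, UnaryPredicate &&P) -> decltype(Range.begin()) {
  return std::remove_if(Range.begin(), Range.end(), P);
}
} // namespace sketch

// Hypothetical caller: removes negative values from a copy of the input.
inline std::vector<int> dropNegatives(std::vector<int> Values) {
  // remove_if only shifts the kept elements to the front and returns the
  // new logical end; the container still has to be shrunk with erase().
  auto NewEnd = sketch::remove_if(Values, [](int V) { return V < 0; });
  Values.erase(NewEnd, Values.end());
  return Values;
}
```

The InlineFunction.cpp hunk in this revision follows the same two-step pattern on its `Returns` list.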

llvm/trunk/include/llvm/IR/BasicBlock.h

Show First 20 Lines • Show All 105 Lines • ▼ Show 20 Lines	public:
const Module *getModule() const;		const Module *getModule() const;
Module *getModule();		Module *getModule();

/// \brief Returns the terminator instruction if the block is well formed or		/// \brief Returns the terminator instruction if the block is well formed or
/// null if the block is not well formed.		/// null if the block is not well formed.
TerminatorInst *getTerminator();		TerminatorInst *getTerminator();
const TerminatorInst *getTerminator() const;		const TerminatorInst *getTerminator() const;

		/// \brief Returns the call instruction calling @llvm.experimental.deoptimize
		/// prior to the terminating return instruction of this basic block, if such a
		/// call is present. Otherwise, returns null.
		CallInst *getTerminatingDeoptimizeCall();
		const CallInst *getTerminatingDeoptimizeCall() const {
		return const_cast<BasicBlock *>(this)->getTerminatingDeoptimizeCall();
		}

/// \brief Returns the call instruction marked 'musttail' prior to the		/// \brief Returns the call instruction marked 'musttail' prior to the
/// terminating return instruction of this basic block, if such a call is		/// terminating return instruction of this basic block, if such a call is
/// present. Otherwise, returns null.		/// present. Otherwise, returns null.
CallInst *getTerminatingMustTailCall();		CallInst *getTerminatingMustTailCall();
const CallInst *getTerminatingMustTailCall() const {		const CallInst *getTerminatingMustTailCall() const {
return const_cast<BasicBlock *>(this)->getTerminatingMustTailCall();		return const_cast<BasicBlock *>(this)->getTerminatingMustTailCall();
}		}

▲ Show 20 Lines • Show All 220 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/IR/CallSite.h

Show First 20 Lines • Show All 447 Lines • ▼ Show 20 Lines	#define CALLSITE_DELEGATE_SETTER(METHOD) \
Optional<OperandBundleUse> getOperandBundle(StringRef Name) const {		Optional<OperandBundleUse> getOperandBundle(StringRef Name) const {
CALLSITE_DELEGATE_GETTER(getOperandBundle(Name));		CALLSITE_DELEGATE_GETTER(getOperandBundle(Name));
}		}

Optional<OperandBundleUse> getOperandBundle(uint32_t ID) const {		Optional<OperandBundleUse> getOperandBundle(uint32_t ID) const {
CALLSITE_DELEGATE_GETTER(getOperandBundle(ID));		CALLSITE_DELEGATE_GETTER(getOperandBundle(ID));
}		}

		unsigned countOperandBundlesOfType(uint32_t ID) const {
		CALLSITE_DELEGATE_GETTER(countOperandBundlesOfType(ID));
		}

IterTy arg_begin() const {		IterTy arg_begin() const {
CALLSITE_DELEGATE_GETTER(arg_begin());		CALLSITE_DELEGATE_GETTER(arg_begin());
}		}

IterTy arg_end() const {		IterTy arg_end() const {
CALLSITE_DELEGATE_GETTER(arg_end());		CALLSITE_DELEGATE_GETTER(arg_end());
}		}

▲ Show 20 Lines • Show All 116 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/IR/Intrinsics.td

	Show First 20 Lines • Show All 587 Lines • ▼ Show 20 Lines
	//			//
	def int_flt_rounds : Intrinsic<[llvm_i32_ty]>,			def int_flt_rounds : Intrinsic<[llvm_i32_ty]>,
	GCCBuiltin<"__builtin_flt_rounds">;			GCCBuiltin<"__builtin_flt_rounds">;
	def int_trap : Intrinsic<[], [], [IntrNoReturn]>,			def int_trap : Intrinsic<[], [], [IntrNoReturn]>,
	GCCBuiltin<"__builtin_trap">;			GCCBuiltin<"__builtin_trap">;
	def int_debugtrap : Intrinsic<[]>,			def int_debugtrap : Intrinsic<[]>,
	GCCBuiltin<"__builtin_debugtrap">;			GCCBuiltin<"__builtin_debugtrap">;

				// Support for dynamic deoptimization (or de-specialization)
				def int_experimental_deoptimize : Intrinsic<[llvm_any_ty], [llvm_vararg_ty],
				[Throws]>;

	// NOP: calls/invokes to this intrinsic are removed by codegen			// NOP: calls/invokes to this intrinsic are removed by codegen
	def int_donothing : Intrinsic<[], [], [IntrNoMem]>;			def int_donothing : Intrinsic<[], [], [IntrNoMem]>;

	// Intrisics to support half precision floating point format			// Intrisics to support half precision floating point format
	let IntrProperties = [IntrNoMem] in {			let IntrProperties = [IntrNoMem] in {
	def int_convert_to_fp16 : Intrinsic<[llvm_i16_ty], [llvm_anyfloat_ty]>;			def int_convert_to_fp16 : Intrinsic<[llvm_i16_ty], [llvm_anyfloat_ty]>;
	def int_convert_from_fp16 : Intrinsic<[llvm_anyfloat_ty], [llvm_i16_ty]>;			def int_convert_from_fp16 : Intrinsic<[llvm_anyfloat_ty], [llvm_i16_ty]>;
	}			}
	▲ Show 20 Lines • Show All 74 Lines • Show Last 20 Lines

llvm/trunk/lib/IR/BasicBlock.cpp

Show First 20 Lines • Show All 156 Lines • ▼ Show 20 Lines	CallInst *BasicBlock::getTerminatingMustTailCall() {

if (auto *CI = dyn_cast<CallInst>(Prev)) {		if (auto *CI = dyn_cast<CallInst>(Prev)) {
if (CI->isMustTailCall())		if (CI->isMustTailCall())
return CI;		return CI;
}		}
return nullptr;		return nullptr;
}		}

		CallInst *BasicBlock::getTerminatingDeoptimizeCall() {
		if (InstList.empty())
		return nullptr;
		auto *RI = dyn_cast<ReturnInst>(&InstList.back());
		if (!RI || RI == &InstList.front())
		return nullptr;

		if (auto *CI = dyn_cast_or_null<CallInst>(RI->getPrevNode()))
		if (Function *F = CI->getCalledFunction())
		if (F->getIntrinsicID() == Intrinsic::experimental_deoptimize)
		return CI;

		return nullptr;
		}

Instruction* BasicBlock::getFirstNonPHI() {		Instruction* BasicBlock::getFirstNonPHI() {
for (Instruction &I : *this)		for (Instruction &I : *this)
if (!isa<PHINode>(I))		if (!isa<PHINode>(I))
return &I;		return &I;
return nullptr;		return nullptr;
}		}

Instruction* BasicBlock::getFirstNonPHIOrDbg() {		Instruction* BasicBlock::getFirstNonPHIOrDbg() {
▲ Show 20 Lines • Show All 259 Lines • Show Last 20 Lines

llvm/trunk/lib/IR/Verifier.cpp

Show First 20 Lines • Show All 4,076 Lines • ▼ Show 20 Lines	case Intrinsic::masked_store: {
Type *DataTy = cast<PointerType>(Ptr->getType())->getElementType();		Type *DataTy = cast<PointerType>(Ptr->getType())->getElementType();
Assert(DataTy == Val->getType(),		Assert(DataTy == Val->getType(),
"masked_store: storee must match pointer type", CS);		"masked_store: storee must match pointer type", CS);
Assert(Mask->getType()->getVectorNumElements() ==		Assert(Mask->getType()->getVectorNumElements() ==
DataTy->getVectorNumElements(),		DataTy->getVectorNumElements(),
"masked_store: vector mask must be same length as data", CS);		"masked_store: vector mask must be same length as data", CS);
break;		break;
}		}

		case Intrinsic::experimental_deoptimize: {
		Assert(CS.isCall(), "experimental_deoptimize cannot be invoked", CS);
		Assert(CS.countOperandBundlesOfType(LLVMContext::OB_deopt) == 1,
		"experimental_deoptimize must have exactly one "
		"\"deopt\" operand bundle");
		Assert(CS.getType() == CS.getInstruction()->getFunction()->getReturnType(),
		"experimental_deoptimize return type must match caller return type");

		if (CS.isCall()) {
		auto *DeoptCI = CS.getInstruction();
		auto *RI = dyn_cast<ReturnInst>(DeoptCI->getNextNode());
		Assert(RI,
		"calls to experimental_deoptimize must be followed by a return");

		if (!CS.getType()->isVoidTy() && RI)
		Assert(RI->getReturnValue() == DeoptCI,
		"calls to experimental_deoptimize must be followed by a return "
		"of the value computed by experimental_deoptimize");
		}

		break;
		}
};		};
}		}

/// \brief Carefully grab the subprogram from a local scope.		/// \brief Carefully grab the subprogram from a local scope.
///		///
/// This carefully grabs the subprogram from a local scope, avoiding the		/// This carefully grabs the subprogram from a local scope, avoiding the
/// built-in assertions that would typically fire.		/// built-in assertions that would typically fire.
static DISubprogram *getSubprogram(Metadata *LocalScope) {		static DISubprogram *getSubprogram(Metadata *LocalScope) {
▲ Show 20 Lines • Show All 261 Lines • Show Last 20 Lines
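
The tail-position part of the Verifier checks above can be sketched on a toy instruction model. Everything here (`Kind`, `Inst`, `checkDeoptTailPosition`, and the message strings) is hypothetical and greatly simplified; it does not use LLVM's IR classes, and it does not model the operand-bundle count or the return-type check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified instruction model (assumption: one basic block,
// instructions identified by their index in the vector).
enum class Kind { Other, DeoptCall, DeoptInvoke, Ret };
struct Inst {
  Kind K;
  bool IsVoid;  // for DeoptCall: whether the call produces no value
  int Returned; // for Ret: index of the returned instruction, -1 for none
};

// Returns an empty string when every deoptimize call is in tail position,
// otherwise a description of the first rule that is violated.
inline std::string checkDeoptTailPosition(const std::vector<Inst> &Block) {
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (Block[I].K == Kind::DeoptInvoke)
      return "deoptimize cannot be invoked";
    if (Block[I].K != Kind::DeoptCall)
      continue;
    // The call must be immediately followed by a return.
    if (I + 1 == Block.size() || Block[I + 1].K != Kind::Ret)
      return "deoptimize call must be followed by a return";
    // A non-void call must have its own result returned.
    if (!Block[I].IsVoid && Block[I + 1].Returned != static_cast<int>(I))
      return "return must use the deoptimize result";
  }
  return "";
}
```

The three rules checked correspond to the bullets in the LangRef text of this revision: no `invoke`, an immediately following `ret`, and a `ret` of the call's value unless it is void.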

llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp

Show First 20 Lines • Show All 421 Lines • ▼ Show 20 Lines	for (BasicBlock::iterator BBI = BB->begin(), E = BB->end(); BBI != E; ) {

// We only need to check for function calls: inlined invoke		// We only need to check for function calls: inlined invoke
// instructions require no special handling.		// instructions require no special handling.
CallInst *CI = dyn_cast<CallInst>(I);		CallInst *CI = dyn_cast<CallInst>(I);

if (!CI || CI->doesNotThrow() || isa<InlineAsm>(CI->getCalledValue()))		if (!CI || CI->doesNotThrow() || isa<InlineAsm>(CI->getCalledValue()))
continue;		continue;

		// We do not need to (and in fact, cannot) convert possibly throwing calls
		// to @llvm.experimental_deoptimize into invokes. The caller's "segment" of
		// the deoptimization continuation attached to the newly inlined
		// @llvm.experimental_deoptimize call should contain the exception handling
		// logic, if any.
		if (auto *F = CI->getCalledFunction())
		if (F->getIntrinsicID() == Intrinsic::experimental_deoptimize)
		continue;

if (auto FuncletBundle = CI->getOperandBundle(LLVMContext::OB_funclet)) {		if (auto FuncletBundle = CI->getOperandBundle(LLVMContext::OB_funclet)) {
// This call is nested inside a funclet. If that funclet has an unwind		// This call is nested inside a funclet. If that funclet has an unwind
// destination within the inlinee, then unwinding out of this call would		// destination within the inlinee, then unwinding out of this call would
// be UB. Rewriting this call to an invoke which targets the inlined		// be UB. Rewriting this call to an invoke which targets the inlined
// invoke's unwind dest would give the call's parent funclet multiple		// invoke's unwind dest would give the call's parent funclet multiple
// unwind destinations, which is something that subsequent EH table		// unwind destinations, which is something that subsequent EH table
// generation can't handle and that the veirifer rejects. So when we		// generation can't handle and that the veirifer rejects. So when we
// see such a call, leave it as a call.		// see such a call, leave it as a call.
▲ Show 20 Lines • Show All 1,170 Lines • ▼ Show 20 Lines	for (BasicBlock::iterator I = FirstNewBlock->begin(),
InsertPoint, FirstNewBlock->getInstList(), AI->getIterator(), I);		InsertPoint, FirstNewBlock->getInstList(), AI->getIterator(), I);
}		}
// Move any dbg.declares describing the allocas into the entry basic block.		// Move any dbg.declares describing the allocas into the entry basic block.
DIBuilder DIB(*Caller->getParent());		DIBuilder DIB(*Caller->getParent());
for (auto &AI : IFI.StaticAllocas)		for (auto &AI : IFI.StaticAllocas)
replaceDbgDeclareForAlloca(AI, AI, DIB, /*Deref=*/false);		replaceDbgDeclareForAlloca(AI, AI, DIB, /*Deref=*/false);
}		}

bool InlinedMustTailCalls = false;		bool InlinedMustTailCalls = false, InlinedDeoptimizeCalls = false;
if (InlinedFunctionInfo.ContainsCalls) {		if (InlinedFunctionInfo.ContainsCalls) {
CallInst::TailCallKind CallSiteTailKind = CallInst::TCK_None;		CallInst::TailCallKind CallSiteTailKind = CallInst::TCK_None;
if (CallInst *CI = dyn_cast<CallInst>(TheCall))		if (CallInst *CI = dyn_cast<CallInst>(TheCall))
CallSiteTailKind = CI->getTailCallKind();		CallSiteTailKind = CI->getTailCallKind();

for (Function::iterator BB = FirstNewBlock, E = Caller->end(); BB != E;		for (Function::iterator BB = FirstNewBlock, E = Caller->end(); BB != E;
++BB) {		++BB) {
for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
CallInst *CI = dyn_cast<CallInst>(&I);		CallInst *CI = dyn_cast<CallInst>(&I);
if (!CI)		if (!CI)
continue;		continue;

		if (Function *F = CI->getCalledFunction())
		InlinedDeoptimizeCalls |=
		F->getIntrinsicID() == Intrinsic::experimental_deoptimize;

// We need to reduce the strength of any inlined tail calls. For		// We need to reduce the strength of any inlined tail calls. For
// musttail, we have to avoid introducing potential unbounded stack		// musttail, we have to avoid introducing potential unbounded stack
// growth. For example, if functions 'f' and 'g' are mutually recursive		// growth. For example, if functions 'f' and 'g' are mutually recursive
// with musttail, we can inline 'g' into 'f' so long as we preserve		// with musttail, we can inline 'g' into 'f' so long as we preserve
// musttail on the cloned call to 'f'. If either the inlined call site		// musttail on the cloned call to 'f'. If either the inlined call site
// or the cloned call site is not musttail, the program already has		// or the cloned call site is not musttail, the program already has
// one frame of stack growth, so it's safe to remove musttail. Here is		// one frame of stack growth, so it's safe to remove musttail. Here is
// a table of example transformations:		// a table of example transformations:
▲ Show 20 Lines • Show All 157 Lines • ▼ Show 20 Lines	for (Function::iterator BB = FirstNewBlock->getIterator(),
} else {		} else {
auto *FPI = cast<FuncletPadInst>(I);		auto *FPI = cast<FuncletPadInst>(I);
if (isa<ConstantTokenNone>(FPI->getParentPad()))		if (isa<ConstantTokenNone>(FPI->getParentPad()))
FPI->setParentPad(CallSiteEHPad);		FPI->setParentPad(CallSiteEHPad);
}		}
}		}
}		}

  if (InlinedDeoptimizeCalls) {
    // We need to at least remove the deoptimizing returns from the Return set,
    // so that the control flow from those returns does not get merged into the
    // caller (but terminate it instead). If the caller's return type does not
    // match the callee's return type, we also need to change the return type of
    // the intrinsic.
    if (Caller->getReturnType() == TheCall->getType()) {
      auto NewEnd = remove_if(Returns, [](ReturnInst *RI) {
        return RI->getParent()->getTerminatingDeoptimizeCall() != nullptr;
      });
      Returns.erase(NewEnd, Returns.end());
    } else {
      SmallVector<ReturnInst *, 8> NormalReturns;
      Function *NewDeoptIntrinsic = Intrinsic::getDeclaration(
          Caller->getParent(), Intrinsic::experimental_deoptimize,
          {Caller->getReturnType()});

      for (ReturnInst *RI : Returns) {
        CallInst *DeoptCall = RI->getParent()->getTerminatingDeoptimizeCall();
        if (!DeoptCall) {
          NormalReturns.push_back(RI);
          continue;
        }

        auto *CurBB = RI->getParent();
        RI->eraseFromParent();

        SmallVector<Value *, 4> CallArgs(DeoptCall->arg_begin(),
                                         DeoptCall->arg_end());

        SmallVector<OperandBundleDef, 1> OpBundles;
        DeoptCall->getOperandBundlesAsDefs(OpBundles);
        DeoptCall->eraseFromParent();
        assert(!OpBundles.empty() &&
               "Expected at least the deopt operand bundle");

        IRBuilder<> Builder(CurBB);
        Value *NewDeoptCall =
            Builder.CreateCall(NewDeoptIntrinsic, CallArgs, OpBundles);
        if (NewDeoptCall->getType()->isVoidTy())
          Builder.CreateRetVoid();
        else
          Builder.CreateRet(NewDeoptCall);
      }

      // Leave behind the normal returns so we can merge control flow.
      std::swap(Returns, NormalReturns);
    }
  }

  // Handle any inlined musttail call sites. In order for a new call site to be
  // musttail, the source of the clone and the inlined call site must have been
  // musttail. Therefore it's safe to return without merging control into the
  // phi below.
  if (InlinedMustTailCalls) {
    // Check if we need to bitcast the result of any musttail calls.
    Type *NewRetTy = Caller->getReturnType();
    bool NeedBitCast = !TheCall->use_empty() && TheCall->getType() != NewRetTy;
[... 220 unchanged lines not shown ...]
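
Note on the deoptimizing-return filtering in the InlinedDeoptimizeCalls block: the same-return-type branch is an erase-remove over the set of returns. A minimal standalone sketch of that idiom follows. It does not use LLVM; `ToyReturn`, `IsDeoptReturn` and `keepNormalReturns` are hypothetical names invented for illustration, with `IsDeoptReturn` standing in for `RI->getParent()->getTerminatingDeoptimizeCall() != nullptr`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a ReturnInst; this is not an LLVM type.
struct ToyReturn {
  std::string Block;  // name of the basic block holding the return
  bool IsDeoptReturn; // true if the block ends in a deoptimize call + ret
};

// Keeps only the returns whose control flow should be merged back into the
// caller. Deoptimizing returns are dropped from the set because they
// terminate the caller instead of rejoining it.
std::vector<ToyReturn> keepNormalReturns(std::vector<ToyReturn> Returns) {
  auto NewEnd =
      std::remove_if(Returns.begin(), Returns.end(),
                     [](const ToyReturn &R) { return R.IsDeoptReturn; });
  Returns.erase(NewEnd, Returns.end());

  // Sanity check: nothing left in the set is a deoptimizing return.
  assert(std::none_of(Returns.begin(), Returns.end(),
                      [](const ToyReturn &R) { return R.IsDeoptReturn; }));
  return Returns;
}
```

Fed the four returns of `@callee` from the inline test (`lleft`, `lright`, `rleft`, `rright`), this sketch would keep only `lright`, which matches the single `store i8 10` that reaches `callee.exit` in the CHECK lines.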

llvm/trunk/test/Transforms/Inline/deoptimize-intrinsic.ll

				; RUN: opt -S -always-inline < %s | FileCheck %s

				declare i8 @llvm.experimental.deoptimize.i8(...)

				define i8 @callee(i1* %c) alwaysinline {
				%c0 = load volatile i1, i1* %c
				br i1 %c0, label %left, label %right

				left:
				%c1 = load volatile i1, i1* %c
				br i1 %c1, label %lleft, label %lright

				lleft:
				%v0 = call i8(...) @llvm.experimental.deoptimize.i8(i32 1) [ "deopt"(i32 1) ]
				ret i8 %v0

				lright:
				ret i8 10

				right:
				%c2 = load volatile i1, i1* %c
				br i1 %c2, label %rleft, label %rright

				rleft:
				%v1 = call i8(...) @llvm.experimental.deoptimize.i8(i32 1, i32 300, float 500.0, <2 x i32*> undef) [ "deopt"(i32 1) ]
				ret i8 %v1

				rright:
				%v2 = call i8(...) @llvm.experimental.deoptimize.i8() [ "deopt"(i32 1) ]
				ret i8 %v2
				}

				define void @caller_0(i1* %c, i8* %ptr) {
				; CHECK-LABEL: @caller_0(
				entry:
				%v = call i8 @callee(i1* %c) [ "deopt"(i32 2) ]
				store i8 %v, i8* %ptr
				ret void

				; CHECK: lleft.i:
				; CHECK-NEXT: call void (...) @llvm.experimental.deoptimize.isVoid(i32 1) [ "deopt"(i32 2, i32 1) ]
				; CHECK-NEXT: ret void

				; CHECK: rleft.i:
				; CHECK-NEXT: call void (...) @llvm.experimental.deoptimize.isVoid(i32 1, i32 300, float 5.000000e+02, <2 x i32*> undef) [ "deopt"(i32 2, i32 1) ]
				; CHECK-NEXT: ret void

				; CHECK: rright.i:
				; CHECK-NEXT: call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"(i32 2, i32 1) ]
				; CHECK-NEXT: ret void

				; CHECK: callee.exit:
				; CHECK-NEXT: store i8 10, i8* %ptr
				; CHECK-NEXT: ret void

				}

				define i32 @caller_1(i1* %c, i8* %ptr) personality i8 3 {
				; CHECK-LABEL: @caller_1(
				entry:
				%v = invoke i8 @callee(i1* %c) [ "deopt"(i32 3) ] to label %normal
				unwind label %unwind

				; CHECK: lleft.i:
				; CHECK-NEXT: %0 = call i32 (...) @llvm.experimental.deoptimize.i32(i32 1) [ "deopt"(i32 3, i32 1) ]
				; CHECK-NEXT: ret i32 %0

				; CHECK: rleft.i:
				; CHECK-NEXT: %1 = call i32 (...) @llvm.experimental.deoptimize.i32(i32 1, i32 300, float 5.000000e+02, <2 x i32*> undef) [ "deopt"(i32 3, i32 1) ]
				; CHECK-NEXT: ret i32 %1

				; CHECK: rright.i:
				; CHECK-NEXT: %2 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 3, i32 1) ]
				; CHECK-NEXT: ret i32 %2

				; CHECK: callee.exit:
				; CHECK-NEXT: br label %normal

				; CHECK: normal:
				; CHECK-NEXT: store i8 10, i8* %ptr
				; CHECK-NEXT: ret i32 42

				unwind:
				%lp = landingpad i32 cleanup
				ret i32 43

				normal:
				store i8 %v, i8* %ptr
				ret i32 42
				}

llvm/trunk/test/Verifier/deoptimize-intrinsic.ll

				; RUN: not opt -verify < %s 2>&1 | FileCheck %s

				declare i8 @llvm.experimental.deoptimize.i8(...)
				declare void @llvm.experimental.deoptimize.isVoid(...)

				declare void @unknown()

				define void @f_notail() {
				entry:
				call void(...) @llvm.experimental.deoptimize.isVoid(i32 0) [ "deopt"() ]
				; CHECK: calls to experimental_deoptimize must be followed by a return
				call void @unknown()
				ret void
				}

				define void @f_nodeopt() {
				entry:
				call void(...) @llvm.experimental.deoptimize.isVoid()
				; CHECK: experimental_deoptimize must have exactly one "deopt" operand bundle
				ret void
				}

				define void @f_invoke() personality i8 3 {
				entry:
				invoke void(...) @llvm.experimental.deoptimize.isVoid(i32 0, float 0.0) to label %ok unwind label %not_ok
				; CHECK: experimental_deoptimize cannot be invoked

				ok:
				ret void

				not_ok:
				%0 = landingpad { i8*, i32 }
				filter [0 x i8*] zeroinitializer
				ret void
				}

				define i8 @f_incorrect_return() {
				entry:
				%val = call i8(...) @llvm.experimental.deoptimize.i8() [ "deopt"() ]
				; CHECK: calls to experimental_deoptimize must be followed by a return of the value computed by experimental_deoptimize
				ret i8 0
				}
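
Note on the verifier test above: the four functions each violate one rule. The rules can be summarized with a small standalone model. This is only a sketch and not the code in Verifier.cpp: `ToyDeoptCall` and `checkDeoptCall` are hypothetical names, and the order of the checks is an assumption. The diagnostic strings are copied from the CHECK lines.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of one call site of experimental_deoptimize.
struct ToyDeoptCall {
  bool IsInvoke;         // call site is an invoke rather than a call
  int NumDeoptBundles;   // number of "deopt" operand bundles attached
  bool NextIsReturn;     // the next instruction is a ret
  bool IsVoid;           // the intrinsic instance returns void
  bool ReturnsCallValue; // the ret returns the value produced by the call
};

// Returns the diagnostic for the first violated rule, or "" if the call site
// passes every rule in this model. The check order is assumed.
std::string checkDeoptCall(const ToyDeoptCall &C) {
  assert(C.NumDeoptBundles >= 0 && "bundle count cannot be negative");
  if (C.IsInvoke)
    return "experimental_deoptimize cannot be invoked";
  if (C.NumDeoptBundles != 1)
    return "experimental_deoptimize must have exactly one \"deopt\" operand "
           "bundle";
  if (!C.NextIsReturn)
    return "calls to experimental_deoptimize must be followed by a return";
  if (!C.IsVoid && !C.ReturnsCallValue)
    return "calls to experimental_deoptimize must be followed by a return of "
           "the value computed by experimental_deoptimize";
  return "";
}
```

In this model `@f_notail` corresponds to `NextIsReturn == false`, `@f_nodeopt` to `NumDeoptBundles == 0`, `@f_invoke` to `IsInvoke == true`, and `@f_incorrect_return` to a non-void call whose `ret` returns a different value.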