Diff 67521

docs/Coroutines.rst

The LLVM IR for this coroutine looks like this:		The LLVM IR for this coroutine looks like this:

.. code-block:: none		.. code-block:: none

define i8* @f(i32 %n) {		define i8* @f(i32 %n) {
entry:		entry:
%size = call i32 @llvm.coro.size.i32()		%size = call i32 @llvm.coro.size.i32()
%alloc = call i8* @malloc(i32 %size)		%alloc = call i8* @malloc(i32 %size)
%hdl = call noalias i8* @llvm.coro.begin(i8* %alloc, i32 0, i8* null, i8* null)		%beg = call token @llvm.coro.begin(i8* %alloc, i8* null, i32 0, i8* null, i8* null)
		%hdl = call noalias i8* @llvm.coro.frame(token %beg)
br label %loop		br label %loop
loop:		loop:
%n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]		%n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]
%inc = add nsw i32 %n.val, 1		%inc = add nsw i32 %n.val, 1
call void @print(i32 %n.val)		call void @print(i32 %n.val)
%0 = call i8 @llvm.coro.suspend(token none, i1 false)		%0 = call i8 @llvm.coro.suspend(token none, i1 false)
switch i8 %0, label %suspend [i8 0, label %loop		switch i8 %0, label %suspend [i8 0, label %loop
i8 1, label %cleanup]		i8 1, label %cleanup]
cleanup:		cleanup:
%mem = call i8* @llvm.coro.free(i8* %hdl)		%mem = call i8* @llvm.coro.free(i8* %hdl)
call void @free(i8* %mem)		call void @free(i8* %mem)
br label %suspend		br label %suspend
suspend:		suspend:
call void @llvm.coro.end(i8* %hdl, i1 false)		call void @llvm.coro.end(i8* %hdl, i1 false)
ret i8* %hdl		ret i8* %hdl
}		}

The `entry` block establishes the coroutine frame. The `coro.size`_ intrinsic is		The `entry` block establishes the coroutine frame. The `coro.size`_ intrinsic is
lowered to a constant representing the size required for the coroutine frame.		lowered to a constant representing the size required for the coroutine frame.
The `coro.begin`_ intrinsic initializes the coroutine frame and returns the		The `coro.begin`_ intrinsic initializes the coroutine frame and returns a
coroutine handle. The first parameter of `coro.begin` is given a block of memory		token that is used to obtain the coroutine handle via `coro.frame` intrinsic.
to be used if the coroutine frame needs to be allocated dynamically.		The first parameter of `coro.begin` is given a block of memory to be used if the
		coroutine frame needs to be allocated dynamically.

The `cleanup` block destroys the coroutine frame. The `coro.free`_ intrinsic,		The `cleanup` block destroys the coroutine frame. The `coro.free`_ intrinsic,
given the coroutine handle, returns a pointer of the memory block to be freed or		given the coroutine handle, returns a pointer of the memory block to be freed or
`null` if the coroutine frame was not allocated dynamically. The `cleanup`		`null` if the coroutine frame was not allocated dynamically. The `cleanup`
block is entered when coroutine runs to completion by itself or destroyed via		block is entered when coroutine runs to completion by itself or destroyed via
call to the `coro.destroy`_ intrinsic.		call to the `coro.destroy`_ intrinsic.

The `suspend` block contains code to be executed when coroutine runs to		The `suspend` block contains code to be executed when coroutine runs to
.. code-block:: text		.. code-block:: text

%f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }		%f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }
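
As a reading aid, the frame layout above can be approximated in plain C++. The following is a minimal hand-written sketch, not what CoroSplit emits: `FFrame`, `f_resume`, `f_destroy`, and the `printed` log standing in for `@print` are hypothetical names chosen for illustration. It assumes the frame stores the next value to print, mirroring the `%inc` spill in the outlined `f`.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hand-written analogue of %f.frame: two function pointers followed by the
// one value that has to survive across the suspend point.
struct FFrame {
  void (*resume)(FFrame *);
  void (*destroy)(FFrame *);
  int n;
};

// Hypothetical stand-in for @print: records values instead of printing them.
static std::vector<int> printed;
static void print(int v) { printed.push_back(v); }

// Analogue of the outlined resume part: runs one loop iteration, then
// returns to the caller, which models reaching the next suspend point.
static void f_resume(FFrame *frame) {
  assert(frame && "resume requires a live frame");
  print(frame->n);
  frame->n += 1;
}

// Analogue of the outlined destroy part: releases the frame memory.
static void f_destroy(FFrame *frame) { std::free(frame); }

// Analogue of f after splitting: allocates and initializes the frame, runs
// up to the first suspend point, and returns the handle.
static FFrame *f(int n) {
  auto *frame = static_cast<FFrame *>(std::malloc(sizeof(FFrame)));
  assert(frame && "allocation failed");
  frame->resume = f_resume;
  frame->destroy = f_destroy;
  print(n);
  frame->n = n + 1; // spill the value needed after resumption
  return frame;
}
```

Calling `f(4)` and then invoking `resume` twice through the returned frame records 4, 5, 6 in `printed`; calling `destroy` through the frame releases it.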

After resume and destroy parts are outlined, function `f` will contain only the		After resume and destroy parts are outlined, function `f` will contain only the
code responsible for creation and initialization of the coroutine frame and		code responsible for creation and initialization of the coroutine frame and
execution of the coroutine until a suspend point is reached:		execution of the coroutine until a suspend point is reached:

.. code-block:: llvm		.. code-block:: none

define i8* @f(i32 %n) {		define i8* @f(i32 %n) {
entry:		entry:
%alloc = call noalias i8* @malloc(i32 24)		%alloc = call noalias i8* @malloc(i32 24)
%0 = call noalias i8* @llvm.coro.begin(i8* %alloc, i32 0, i8* null, i8* null)		%beg = call token @llvm.coro.begin(i8* %alloc, i8* null, i32 0, i8* null, i8* null)
		%0 = call i8* @llvm.coro.frame(token %beg)
%frame = bitcast i8* %0 to %f.frame*		%frame = bitcast i8* %0 to %f.frame*
%1 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 0		%1 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 0
store void (%f.frame*)* @f.resume, void (%f.frame*)** %1		store void (%f.frame*)* @f.resume, void (%f.frame*)** %1
%2 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 1		%2 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 1
store void (%f.frame*)* @f.destroy, void (%f.frame*)** %2		store void (%f.frame*)* @f.destroy, void (%f.frame*)** %2

%inc = add nsw i32 %n, 1		%inc = add nsw i32 %n, 1
%inc.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2		%inc.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2
RAII idiom and is suitable for allocation elision optimization which avoid		RAII idiom and is suitable for allocation elision optimization which avoid
dynamic allocation by storing the coroutine frame as a static `alloca` in its		dynamic allocation by storing the coroutine frame as a static `alloca` in its
caller.		caller.

In the entry block, we will call `coro.alloc`_ intrinsic that will return `null`		In the entry block, we will call `coro.alloc`_ intrinsic that will return `null`
when dynamic allocation is required, and an address of an alloca on the caller's		when dynamic allocation is required, and an address of an alloca on the caller's
frame where coroutine frame can be stored if dynamic allocation is elided.		frame where coroutine frame can be stored if dynamic allocation is elided.

.. code-block:: llvm		.. code-block:: none

entry:		entry:
%elide = call i8* @llvm.coro.alloc()		%elide = call i8* @llvm.coro.alloc()
%need.dyn.alloc = icmp ne i8* %elide, null		%need.dyn.alloc = icmp ne i8* %elide, null
br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc		br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc
dyn.alloc:		dyn.alloc:
%size = call i32 @llvm.coro.size.i32()		%size = call i32 @llvm.coro.size.i32()
%alloc = call i8* @CustomAlloc(i32 %size)		%alloc = call i8* @CustomAlloc(i32 %size)
br label %coro.begin		br label %coro.begin
coro.begin:		coro.begin:
%phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ]		%phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ]
%hdl = call noalias i8* @llvm.coro.begin(i8* %phi, i32 0, i8* null, i8* null)		%beg = call token @llvm.coro.begin(i8* %phi, i8* null, i32 0, i8* null, i8* null)
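
The allocation protocol in this entry block, together with the conditional free in the cleanup block, can be modeled in ordinary C++. This is only a sketch under stated assumptions: `coroAlloc`, `coroFree`, `customAlloc`, `customFree`, and the `Coro` record are hypothetical stand-ins for the intrinsics and the allocator, and the elision decision is passed in explicitly as a caller-provided slot, whereas in LLVM it is made by the CoroElide pass.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical coroutine frame; the real layout is built by the compiler.
struct Frame {
  int value;
};

// What the entry block ends up knowing: where the frame lives and whether
// that memory came from the dynamic allocator.
struct Coro {
  Frame *frame;
  bool onHeap;
};

// Counts outstanding dynamic allocations so the effect can be observed.
static int liveHeapFrames = 0;

static void *customAlloc(std::size_t size) {
  ++liveHeapFrames;
  return std::malloc(size);
}

static void customFree(void *p) {
  --liveHeapFrames;
  std::free(p);
}

// Models coro.alloc: the caller's slot when allocation is elided, null when
// dynamic allocation is required.
static void *coroAlloc(Frame *callerSlot) { return callerSlot; }

// Models coro.free: the block to release, or null when allocation was elided.
static void *coroFree(const Coro &c) { return c.onHeap ? c.frame : nullptr; }

// Entry block: allocate dynamically only if coro.alloc returned null.
static Coro begin(Frame *callerSlot) {
  void *mem = coroAlloc(callerSlot);
  const bool onHeap = (mem == nullptr);
  if (onHeap)
    mem = customAlloc(sizeof(Frame));
  assert(mem && "no memory for the coroutine frame");
  return Coro{static_cast<Frame *>(mem), onHeap};
}

// Cleanup block: deallocation is skipped when coro.free returns null.
static void cleanup(const Coro &c) {
  if (void *mem = coroFree(c))
    customFree(mem);
}
```

With a caller-provided slot, `begin` never touches the allocator and `cleanup` frees nothing; with a null slot, the two calls pair one `customAlloc` with one `customFree`.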

In the cleanup block, we will make freeing the coroutine frame conditional on		In the cleanup block, we will make freeing the coroutine frame conditional on
`coro.free`_ intrinsic. If allocation is elided, `coro.free`_ returns `null`		`coro.free`_ intrinsic. If allocation is elided, `coro.free`_ returns `null`
thus skipping the deallocation code:		thus skipping the deallocation code:

.. code-block:: llvm		.. code-block:: llvm

cleanup:		cleanup:
entry:
%need.dyn.alloc = icmp ne i8* %elide, null		%need.dyn.alloc = icmp ne i8* %elide, null
br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc		br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc
dyn.alloc:		dyn.alloc:
%size = call i32 @llvm.coro.size.i32()		%size = call i32 @llvm.coro.size.i32()
%alloc = call i8* @malloc(i32 %size)		%alloc = call i8* @malloc(i32 %size)
br label %coro.begin		br label %coro.begin
coro.begin:		coro.begin:
%phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ]		%phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ]
%hdl = call noalias i8* @llvm.coro.begin(i8* %phi, i32 0, i8* %pv, i8* null)		%beg = call token @llvm.coro.begin(i8* %phi, i8* %elide, i32 0, i8* %pv, i8* null)
		%hdl = call i8* @llvm.coro.frame(token %beg)
br label %loop		br label %loop
loop:		loop:
%n.val = phi i32 [ %n, %coro.begin ], [ %inc, %loop ]		%n.val = phi i32 [ %n, %coro.begin ], [ %inc, %loop ]
%inc = add nsw i32 %n.val, 1		%inc = add nsw i32 %n.val, 1
store i32 %n.val, i32* %promise		store i32 %n.val, i32* %promise
%0 = call i8 @llvm.coro.suspend(token none, i1 false)		%0 = call i8 @llvm.coro.suspend(token none, i1 false)
switch i8 %0, label %suspend [i8 0, label %loop		switch i8 %0, label %suspend [i8 0, label %loop
i8 1, label %cleanup]		i8 1, label %cleanup]
Using this intrinsic on a coroutine that does not have a coroutine promise		Using this intrinsic on a coroutine that does not have a coroutine promise
leads to undefined behavior. It is possible to read and modify coroutine		leads to undefined behavior. It is possible to read and modify coroutine
promise of the coroutine which is currently executing. The coroutine author and		promise of the coroutine which is currently executing. The coroutine author and
a coroutine user are responsible for making sure there are no data races.		a coroutine user are responsible for making sure there are no data races.

Example:		Example:
""""""""		""""""""

.. code-block:: llvm		.. code-block:: text

define i8* @f(i32 %n) {		define i8* @f(i32 %n) {
entry:		entry:
%promise = alloca i32		%promise = alloca i32
%pv = bitcast i32* %promise to i8*		%pv = bitcast i32* %promise to i8*
...		...
; the third argument to coro.begin points to the coroutine promise.		; the fourth argument to coro.begin points to the coroutine promise.
%hdl = call noalias i8* @llvm.coro.begin(i8* %alloc, i32 0, i8* %pv, i8* null)		%beg = call token @llvm.coro.begin(i8* %alloc, i8* null, i32 0, i8* %pv, i8* null)
		%hdl = call noalias i8* @llvm.coro.frame(token %beg)
...		...
store i32 42, i32* %promise ; store something into the promise		store i32 42, i32* %promise ; store something into the promise
...		...
ret i8* %hdl		ret i8* %hdl
}		}

define i32 @main() {		define i32 @main() {
entry:		entry:
the coroutine frame.		the coroutine frame.

.. _coro.begin:		.. _coro.begin:

'llvm.coro.begin' Intrinsic		'llvm.coro.begin' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::		::

declare i8* @llvm.coro.begin(i8* <mem>, i32 <align>, i8* <promise>, i8* <fnaddr>)		declare token @llvm.coro.begin(i8* <mem>, i8* <elide>, i32 <align>, i8* <promise>, i8* <fnaddr>)

Overview:		Overview:
"""""""""		"""""""""

The '``llvm.coro.begin``' intrinsic returns an address of the coroutine frame.		The '``llvm.coro.begin``' intrinsic captures coroutine initialization
		information and returns a token that can be used by `coro.frame` intrinsic to
		return an address of the coroutine frame.

Arguments:		Arguments:
""""""""""		""""""""""

The first argument is a pointer to a block of memory where coroutine frame		The first argument is a pointer to a block of memory where coroutine frame
will be stored.		will be stored.

The second argument provides information on the alignment of the memory returned		The second argument is either null or an SSA value of `coro.alloc` intrinsic.

		The third argument provides information on the alignment of the memory returned
by the allocation function and given to `coro.begin` by the first argument. If		by the allocation function and given to `coro.begin` by the first argument. If
this argument is 0, the memory is assumed to be aligned to 2 * sizeof(i8*).		this argument is 0, the memory is assumed to be aligned to 2 * sizeof(i8*).
This argument only accepts constants.		This argument only accepts constants.

The third argument, if not `null`, designates a particular alloca instruction to		The fourth argument, if not `null`, designates a particular alloca instruction to
be a `coroutine promise`_.		be a `coroutine promise`_.

The fourth argument is `null` before coroutine is split, and later is replaced		The fifth argument is `null` before coroutine is split, and later is replaced
to point to a private global constant array containing function pointers to		to point to a private global constant array containing function pointers to
outlined resume and destroy parts of the coroutine.		outlined resume and destroy parts of the coroutine.

Semantics:		Semantics:
""""""""""		""""""""""

Depending on the alignment requirements of the objects in the coroutine frame		Depending on the alignment requirements of the objects in the coroutine frame
and/or on the codegen compactness reasons the pointer returned from `coro.begin`		and/or on the codegen compactness reasons the pointer returned from `coro.frame`
may be at offset to the `%mem` argument. (This could be beneficial if		associated with a particular `coro.begin` may be at offset to the `%mem`
instructions that express relative access to data can be more compactly encoded		argument. (This could be beneficial if instructions that express relative access
with small positive and negative offsets).		to data can be more compactly encoded with small positive and negative offsets).

A frontend should emit exactly one `coro.begin` intrinsic per coroutine.		A frontend should emit exactly one `coro.begin` intrinsic per coroutine.

.. _coro.free:		.. _coro.free:

'llvm.coro.free' Intrinsic		'llvm.coro.free' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::		::

declare i8* @llvm.coro.free(i8* <frame>)		declare i8* @llvm.coro.free(i8* <frame>)

Overview:		Overview:
"""""""""		"""""""""

The '``llvm.coro.free``' intrinsic returns a pointer to a block of memory where		The '``llvm.coro.free``' intrinsic returns a pointer to a block of memory where
coroutine frame is stored or `null` if this instance of a coroutine did not use		coroutine frame is stored or `null` if this instance of a coroutine did not use
dynamically allocated memory for its coroutine frame.		dynamically allocated memory for its coroutine frame.

Arguments:		Arguments:
""""""""""		""""""""""

A pointer to the coroutine frame. This should be the same pointer that was		A pointer to the coroutine frame. This should be the same pointer that was
returned by prior `coro.begin` call.		returned by prior `coro.frame` call.

Example (custom deallocation function):		Example (custom deallocation function):
"""""""""""""""""""""""""""""""""""""""		"""""""""""""""""""""""""""""""""""""""

.. code-block:: llvm		.. code-block:: llvm

cleanup:		cleanup:
%mem = call i8* @llvm.coro.free(i8* %frame)		%mem = call i8* @llvm.coro.free(i8* %frame)
Semantics:		Semantics:
""""""""""		""""""""""

If the coroutine is eligible for heap elision, this intrinsic is lowered to an		If the coroutine is eligible for heap elision, this intrinsic is lowered to an
alloca storing the coroutine frame. Otherwise, it is lowered to constant `null`.		alloca storing the coroutine frame. Otherwise, it is lowered to constant `null`.

A frontend should emit at most one `coro.alloc` intrinsic per coroutine.		A frontend should emit at most one `coro.alloc` intrinsic per coroutine.

		If `coro.alloc` is present, the second parameter to `coro.begin` should refer
		to it.

Example:		Example:
""""""""		""""""""

.. code-block:: llvm		.. code-block:: text

entry:		entry:
%elide = call i8* @llvm.coro.alloc()		%elide = call i8* @llvm.coro.alloc()
%0 = icmp ne i8* %elide, null		%0 = icmp ne i8* %elide, null
br i1 %0, label %coro.begin, label %coro.alloc		br i1 %0, label %coro.begin, label %coro.alloc

coro.alloc:		coro.alloc:
%frame.size = call i32 @llvm.coro.size()		%frame.size = call i32 @llvm.coro.size()
%alloc = call i8* @MyAlloc(i32 %frame.size)		%alloc = call i8* @MyAlloc(i32 %frame.size)
br label %coro.begin		br label %coro.begin

coro.begin:		coro.begin:
%phi = phi i8* [ %elide, %entry ], [ %alloc, %coro.alloc ]		%phi = phi i8* [ %elide, %entry ], [ %alloc, %coro.alloc ]
%frame = call i8* @llvm.coro.begin(i8* %phi, i32 0, i8* null, i8* null)		%beg = call token @llvm.coro.begin(i8* %phi, i8* %elide, i32 0, i8* null, i8* null)
		%frame = call i8* @llvm.coro.frame(token %beg)

.. _coro.frame:		.. _coro.frame:

'llvm.coro.frame' Intrinsic		'llvm.coro.frame' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::		::

declare i8* @llvm.coro.frame()		declare i8* @llvm.coro.frame()

Overview:		Overview:
"""""""""		"""""""""

The '``llvm.coro.frame``' intrinsic returns an address of the coroutine frame of		The '``llvm.coro.frame``' intrinsic returns an address of the coroutine frame of
the enclosing coroutine.		the enclosing coroutine.

Arguments:		Arguments:
""""""""""		""""""""""

None		A token that refers to `coro.begin` instruction.

Semantics:		Semantics:
""""""""""		""""""""""

This intrinsic is lowered to refer to the `coro.begin`_ instruction. This is		This intrinsic is lowered to refer to address of the coroutine frame.
a frontend convenience intrinsic that makes it easier to refer to the
coroutine frame.

.. _coro.end:		.. _coro.end:

'llvm.coro.end' Intrinsic		'llvm.coro.end' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::		::

declare void @llvm.coro.end(i8* <handle>, i1 <unwind>)		declare void @llvm.coro.end(i8* <handle>, i1 <unwind>)
---------		---------
The pass CoroSplit builds coroutine frame and outlines resume and destroy parts		The pass CoroSplit builds coroutine frame and outlines resume and destroy parts
into separate functions.		into separate functions.

CoroElide		CoroElide
---------		---------
The pass CoroElide examines if the inlined coroutine is eligible for heap		The pass CoroElide examines if the inlined coroutine is eligible for heap
allocation elision optimization. If so, it replaces `coro.alloc` and		allocation elision optimization. If so, it replaces `coro.alloc` and
`coro.begin` intrinsic with an address of a coroutine frame placed on its caller		`coro.frame` intrinsic with an address of a coroutine frame placed on its caller
and replaces `coro.free` intrinsics with `null` to remove the deallocation code.		and replaces `coro.free` intrinsics with `null` to remove the deallocation code.
This pass also replaces `coro.resume` and `coro.destroy` intrinsics with direct		This pass also replaces `coro.resume` and `coro.destroy` intrinsics with direct
calls to resume and destroy functions for a particular coroutine where possible.		calls to resume and destroy functions for a particular coroutine where possible.

CoroCleanup		CoroCleanup
-----------		-----------
This pass runs late to lower all coroutine related intrinsics not replaced by		This pass runs late to lower all coroutine related intrinsics not replaced by
earlier passes.		earlier passes.

Upstreaming sequence (rough plan)		Upstreaming sequence (rough plan)
=================================		=================================
#. Add documentation.		#. Add documentation.
#. Add coroutine intrinsics.		#. Add coroutine intrinsics.
#. Add empty coroutine passes. <== we are here		#. Add empty coroutine passes.
#. Add coroutine devirtualization + tests.		#. Add coroutine devirtualization + tests.
#. Add CGSCC restart trigger + tests.		#. Add CGSCC restart trigger + tests.
#. Add coroutine heap elision + tests.		#. Add coroutine heap elision + tests.
#. Add custom allocation heap elision + tests.		#. Add custom allocation heap elision + tests. <== we are here
#. Add coroutine splitting logic + tests.		#. Add coroutine splitting logic + tests.
#. Add simple coroutine frame builder + tests.		#. Add simple coroutine frame builder + tests.
#. Add the rest of the logic + tests. (Maybe split further as needed).		#. Add the rest of the logic + tests. (Maybe split further as needed).

Areas Requiring Attention		Areas Requiring Attention
=========================		=========================
#. A coroutine frame is bigger than it could be. Adding stack packing and stack		#. A coroutine frame is bigger than it could be. Adding stack packing and stack
coloring like optimization on the coroutine frame will result in tighter		coloring like optimization on the coroutine frame will result in tighter

include/llvm/IR/Intrinsics.td

def int_experimental_gc_relocate : Intrinsic<[llvm_any_ty],
[IntrReadMem]>;		[IntrReadMem]>;

//===------------------------ Coroutine Intrinsics ---------------===//		//===------------------------ Coroutine Intrinsics ---------------===//
// These are documented in docs/Coroutines.rst		// These are documented in docs/Coroutines.rst

// Coroutine Structure Intrinsics.		// Coroutine Structure Intrinsics.

def int_coro_alloc : Intrinsic<[llvm_ptr_ty], [], []>;		def int_coro_alloc : Intrinsic<[llvm_ptr_ty], [], []>;
def int_coro_begin : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i32_ty,		def int_coro_begin : Intrinsic<[llvm_token_ty], [llvm_ptr_ty, llvm_ptr_ty,
llvm_ptr_ty, llvm_ptr_ty],		llvm_i32_ty, llvm_ptr_ty, llvm_ptr_ty],
[WriteOnly<0>, ReadNone<2>, ReadOnly<3>,		[WriteOnly<0>, WriteOnly<0>,
NoCapture<3>]>;		ReadNone<3>, ReadOnly<4>, NoCapture<4>]>;

def int_coro_free : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty],		def int_coro_free : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty],
[IntrArgMemOnly, ReadOnly<0>, NoCapture<0>]>;		[IntrArgMemOnly, ReadOnly<0>, NoCapture<0>]>;
def int_coro_end : Intrinsic<[], [llvm_ptr_ty, llvm_i1_ty], []>;		def int_coro_end : Intrinsic<[], [llvm_ptr_ty, llvm_i1_ty], []>;

def int_coro_frame : Intrinsic<[llvm_ptr_ty], [], [IntrNoMem]>;		def int_coro_frame : Intrinsic<[llvm_ptr_ty], [llvm_token_ty], [IntrNoMem]>;
def int_coro_size : Intrinsic<[llvm_anyint_ty], [], [IntrNoMem]>;		def int_coro_size : Intrinsic<[llvm_anyint_ty], [], [IntrNoMem]>;

def int_coro_save : Intrinsic<[llvm_token_ty], [llvm_ptr_ty], []>;		def int_coro_save : Intrinsic<[llvm_token_ty], [llvm_ptr_ty], []>;
def int_coro_suspend : Intrinsic<[llvm_i8_ty], [llvm_token_ty, llvm_i1_ty], []>;		def int_coro_suspend : Intrinsic<[llvm_i8_ty], [llvm_token_ty, llvm_i1_ty], []>;

def int_coro_param : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_ptr_ty],		def int_coro_param : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_ptr_ty],
[IntrNoMem, ReadNone<0>, ReadNone<1>]>;		[IntrNoMem, ReadNone<0>, ReadNone<1>]>;


lib/Transforms/Coroutines/CoroElide.cpp

// to coroutine sub-functions.		// to coroutine sub-functions.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "CoroInternal.h"		#include "CoroInternal.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/IR/InstIterator.h"		#include "llvm/IR/InstIterator.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
		#include "llvm/Support/ErrorHandling.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "coro-elide"		#define DEBUG_TYPE "coro-elide"

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Top Level Driver		// Top Level Driver
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

namespace {		namespace {
struct CoroElide : FunctionPass {		struct CoroElide : FunctionPass {
static char ID;		static char ID;
CoroElide() : FunctionPass(ID) {}		CoroElide() : FunctionPass(ID) {}

bool NeedsToRun = false;		bool NeedsToRun = false;

bool doInitialization(Module &M) override {		bool doInitialization(Module &M) override {
NeedsToRun = coro::declaresIntrinsics(M, {"llvm.coro.begin"});		NeedsToRun = coro::declaresIntrinsics(M, {"llvm.coro.begin"});
return false;		return false;
}		}

bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;
void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<AAResultsWrapperPass>();		AU.addRequired<AAResultsWrapperPass>();
AU.setPreservesCFG();		AU.setPreservesCFG();
}		}
};		};
}		}

char CoroElide::ID = 0;		char CoroElide::ID = 0;
INITIALIZE_PASS_BEGIN(		INITIALIZE_PASS_BEGIN(
CoroElide, "coro-elide",		CoroElide, "coro-elide",
"Coroutine frame allocation elision and indirect calls replacement", false,		"Coroutine frame allocation elision and indirect calls replacement", false,
false)		false)
INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)		INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
INITIALIZE_PASS_END(		INITIALIZE_PASS_END(
CoroElide, "coro-elide",		CoroElide, "coro-elide",
"Coroutine frame allocation elision and indirect calls replacement", false,		"Coroutine frame allocation elision and indirect calls replacement", false,
false)		false)

Pass *llvm::createCoroElidePass() { return new CoroElide(); }		Pass *llvm::createCoroElidePass() { return new CoroElide(); }

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Implementation		// Implementation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

// Go through the list of coro.subfn.addr intrinsics and replace them with the		// Go through the list of coro.subfn.addr intrinsics and replace them with the
if (ValueTy != IntrTy) {
Value = ConstantExpr::getBitCast(Value, IntrTy);		Value = ConstantExpr::getBitCast(Value, IntrTy);
}		}

// Now the value type matches the type of the intrinsic. Replace them all!		// Now the value type matches the type of the intrinsic. Replace them all!
for (CoroSubFnInst *I : Users)		for (CoroSubFnInst *I : Users)
replaceAndRecursivelySimplify(I, Value);		replaceAndRecursivelySimplify(I, Value);
}		}

		// See if any operand of the call instruction references the coroutine frame.
		static bool operandReferences(CallInst *CI, AllocaInst *Frame, AAResults &AA) {
		for (Value *Op : CI->operand_values())
		if (AA.alias(Op, Frame) != NoAlias)
		return true;
		return false;
		majnemer: This doesn't seem conservative enough because you only check whether the call's operands alias the alloca. I could imagine the call referencing the coroutine frame via other means (load/store via global). I think `getModRefInfo() != MRI_NoModRef` would capture this better.
		}

		// Look for any tail calls referencing the coroutine frame and remove tail
		// attribute from them, since now coroutine frame resides on the stack and tail
		// call implies that the function does not reference anything on the stack.
		static void removeTailCallAttribute(AllocaInst *Frame, AAResults &AA) {
		Function &F = *Frame->getFunction();
		MemoryLocation Mem(Frame);
		for (Instruction &I : instructions(F))
		if (auto *Call = dyn_cast<CallInst>(&I))
		if (Call->isTailCall() && operandReferences(Call, Frame, AA)) {
		majnemer: Should we `report_fatal_error` if the `CallSite` is `must_tail`?
		// FIXME: If we ever hit this check. Evaluate whether it is more
		// appropriate to retain musttail and allow the code to compile.
		if (Call->isMustTailCall())
		report_fatal_error("Call referring to the coroutine frame cannot be "
		"marked as musttail");
		majnemer: `auto *`
		Call->setTailCall(false);
		}
		}

		// Given a resume function @f.resume(%f.frame* %frame), returns %f.frame type.
		static Type *getFrameType(Function *Resume) {
		auto *ArgType = Resume->getArgumentList().front().getType();
		return cast<PointerType>(ArgType)->getElementType();
		}

		// Finds first non alloca instruction in the entry block of a function.
		static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
		for (Instruction &I : F->getEntryBlock())
		if (!isa<AllocaInst>(&I))
		return &I;
		llvm_unreachable("no terminator in the entry block");
		}

		// To elide heap allocations we need to suppress code blocks guarded by
		majnemer: Do we have to worry about the alignment of this alloca?
		GorNishanov (author): Added comment: "FIXME: Design how to transmit alignment information for every alloca that is spilled into the coroutine frame and recreate the alignment information here. Possibly we will need to do a mini SROA here and break the coroutine frame into individual AllocaInst recreating the original alignment."
		// llvm.coro.alloc and llvm.coro.free instructions.
		static void elideHeapAllocations(CoroBeginInst *CoroBegin, Type *FrameTy,
		CoroAllocInst *AllocInst, AAResults &AA) {
		LLVMContext &C = CoroBegin->getContext();
		auto *InsertPt = getFirstNonAllocaInTheEntryBlock(CoroBegin->getFunction());

		// FIXME: Design how to transmit alignment information for every alloca that
		// is spilled into the coroutine frame and recreate the alignment information
		// here. Possibly we will need to do a mini SROA here and break the coroutine
		// frame into individual AllocaInst recreating the original alignment.
		auto *Frame = new AllocaInst(FrameTy, "", InsertPt);
		auto *FrameVoidPtr =
		new BitCastInst(Frame, Type::getInt8PtrTy(C), "vFrame", InsertPt);

		// Replacing llvm.coro.alloc with non-null value will suppress dynamic
		// allocation as it is expected for the frontend to generate the code that
		// looks like:
		// mem = coro.alloc();
		// if (!mem) mem = malloc(coro.size());
		// coro.begin(mem, ...)
		AllocInst->replaceAllUsesWith(FrameVoidPtr);
		AllocInst->eraseFromParent();

		// To suppress deallocation code, we replace all llvm.coro.free intrinsics
		// associated with this coro.begin with null constant.
		auto *NullPtr = ConstantPointerNull::get(Type::getInt8PtrTy(C));
		coro::replaceAllCoroFrees(CoroBegin, NullPtr);
		CoroBegin->lowerTo(FrameVoidPtr);

		// Since now coroutine frame lives on the stack we need to make sure that
		// any tail call referencing it, must be made non-tail call.
		removeTailCallAttribute(Frame, AA);
		}

// See if there are any coro.subfn.addr intrinsics directly referencing		// See if there are any coro.subfn.addr intrinsics directly referencing
// the coro.begin. If found, replace them with an appropriate coroutine		// the coro.begin. If found, replace them with an appropriate coroutine
// subfunction associated with that coro.begin.		// subfunction associated with that coro.begin.
static bool replaceIndirectCalls(CoroBeginInst *CoroBegin, AAResults& AA) {		static bool replaceIndirectCalls(CoroBeginInst *CoroBegin, AAResults &AA) {
SmallVector<CoroSubFnInst *, 8> ResumeAddr;		SmallVector<CoroSubFnInst *, 8> ResumeAddr;
SmallVector<CoroSubFnInst *, 8> DestroyAddr;		SmallVector<CoroSubFnInst *, 8> DestroyAddr;

for (User *U : CoroBegin->users()) {		for (User *CF : CoroBegin->users()) {
		assert(isa<CoroFrameInst>(CF) &&
		"CoroBegin can be only used by coro.frame instructions");
		for (User *U : CF->users()) {
if (auto *II = dyn_cast<CoroSubFnInst>(U)) {		if (auto *II = dyn_cast<CoroSubFnInst>(U)) {
switch (II->getIndex()) {		switch (II->getIndex()) {
case CoroSubFnInst::ResumeIndex:		case CoroSubFnInst::ResumeIndex:
ResumeAddr.push_back(II);		ResumeAddr.push_back(II);
break;		break;
case CoroSubFnInst::DestroyIndex:		case CoroSubFnInst::DestroyIndex:
DestroyAddr.push_back(II);		DestroyAddr.push_back(II);
break;		break;
default:		default:
llvm_unreachable("unexpected coro.subfn.addr constant");		llvm_unreachable("unexpected coro.subfn.addr constant");
}		}
}		}
}		}
		}
if (ResumeAddr.empty() && DestroyAddr.empty())		if (ResumeAddr.empty() && DestroyAddr.empty())
return false;		return false;

// PostSplit coro.begin refers to an array of subfunctions in its Info		// PostSplit coro.begin refers to an array of subfunctions in its Info
// argument.		// argument.
ConstantArray *Resumers = CoroBegin->getInfo().Resumers;		ConstantArray *Resumers = CoroBegin->getInfo().Resumers;
assert(Resumers && "PostSplit coro.begin Info argument must refer to an array"		assert(Resumers && "PostSplit coro.begin Info argument must refer to an array"
"of coroutine subfunctions");		"of coroutine subfunctions");
auto *ResumeAddrConstant =		auto *ResumeAddrConstant =
ConstantExpr::getExtractValue(Resumers, CoroSubFnInst::ResumeIndex);		ConstantExpr::getExtractValue(Resumers, CoroSubFnInst::ResumeIndex);
replaceWithConstant(ResumeAddrConstant, ResumeAddr);		replaceWithConstant(ResumeAddrConstant, ResumeAddr);

if (DestroyAddr.empty())		if (DestroyAddr.empty())
return true;		return true;

auto *DestroyAddrConstant =		auto *DestroyAddrConstant =
ConstantExpr::getExtractValue(Resumers, CoroSubFnInst::DestroyIndex);		ConstantExpr::getExtractValue(Resumers, CoroSubFnInst::DestroyIndex);
replaceWithConstant(DestroyAddrConstant, DestroyAddr);		replaceWithConstant(DestroyAddrConstant, DestroyAddr);
#if 0
// If llvm.coro.begin refers to llvm.coro.alloc, we can elide the allocation.
auto *AllocInst = CoroBegin->getAlloc();

if (AllocInst) {		// If llvm.coro.begin refers to llvm.coro.alloc, we can elide the allocation.
auto FrameTy = getFrameType(cast<Function>(ResumeAddrConstant));		if (auto *AllocInst = CoroBegin->getAlloc()) {
		majnemer (inline comment, done): Is this still safe if there are multiple coro.begin calls and some of them refer to llvm.coro.alloc and others do not?
		GorNishanov (author, inline reply): Addressed by making coro.begin return a token and having coro.frame take that token and return the coroutine frame address. This should prevent coro.begin from being duplicated.
		// FIXME: The check above is overly lax. It only checks for whether we have
		// an ability to elide heap allocations, not whether it is safe to do so.
		// We need to do something like:
		// If for every exit from the function where coro.begin is
		// live, there is a coro.free or coro.destroy dominating that exit block,
		// then it is safe to elide heap allocation, since the lifetime of coroutine
		// is fully enclosed in its caller.
		auto *FrameTy = getFrameType(cast<Function>(ResumeAddrConstant));
		majnemer (inline comment, done): We would typically sink this assignment into the if.
elideHeapAllocations(CoroBegin, FrameTy, AllocInst, AA);		elideHeapAllocations(CoroBegin, FrameTy, AllocInst, AA);
		majnemer (inline comment, done): `auto *`
}		}
#endif
return true;		return true;
}		}

// See if there are any coro.subfn.addr instructions referring to coro.devirt		// See if there are any coro.subfn.addr instructions referring to coro.devirt
// trigger, if so, replace them with a direct call to devirt trigger function.		// trigger, if so, replace them with a direct call to devirt trigger function.
static bool replaceDevirtTrigger(Function &F) {		static bool replaceDevirtTrigger(Function &F) {
SmallVector<CoroSubFnInst *, 1> DevirtAddr;		SmallVector<CoroSubFnInst *, 1> DevirtAddr;
for (auto &I : instructions(F))		for (auto &I : instructions(F))
Show All 23 Lines	bool CoroElide::runOnFunction(Function &F) {
for (auto &I : instructions(F))		for (auto &I : instructions(F))
if (auto *CB = dyn_cast<CoroBeginInst>(&I))		if (auto *CB = dyn_cast<CoroBeginInst>(&I))
if (CB->getInfo().isPostSplit())		if (CB->getInfo().isPostSplit())
CoroBegins.push_back(CB);		CoroBegins.push_back(CB);

if (CoroBegins.empty())		if (CoroBegins.empty())
return Changed;		return Changed;

AAResults& AA = getAnalysis<AAResultsWrapperPass>().getAAResults();		AAResults &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
for (auto *CB : CoroBegins)		for (auto *CB : CoroBegins)
Changed \|= replaceIndirectCalls(CB, AA);		Changed \|= replaceIndirectCalls(CB, AA);

return Changed;		return Changed;
}		}
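
The FIXME in `replaceIndirectCalls` describes a safety condition that this diff does not implement: every exit from the caller must be dominated by a block containing `coro.free` or `coro.destroy`. A minimal sketch of that check on a toy control-flow graph might look like the following. `ToyCFG`, `computeDominators`, and `isElisionSafe` are hypothetical names invented for illustration and are not LLVM APIs; the sketch also assumes that `coro.begin` is live at every exit and that all blocks are reachable from block 0. A real implementation would more likely query LLVM's dominator tree analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG, not an LLVM type. Block 0 is the entry block and
// every block is assumed to be reachable from it.
struct ToyCFG {
  std::vector<std::vector<int>> Succs; // Succs[B] lists the successors of B
  std::vector<bool> HasDestroy;        // B contains a coro.free/coro.destroy
};

// Classic iterative data-flow computation of dominator sets.
// Dom[B][D] is true when block D dominates block B.
std::vector<std::vector<bool>> computeDominators(const ToyCFG &G) {
  const std::size_t N = G.Succs.size();
  std::vector<std::vector<int>> Preds(N);
  for (std::size_t B = 0; B < N; ++B)
    for (int S : G.Succs[B])
      Preds[S].push_back(static_cast<int>(B));

  // Start from "everything dominates everything" except for the entry,
  // which is dominated only by itself, then shrink to a fixed point.
  std::vector<std::vector<bool>> Dom(N, std::vector<bool>(N, true));
  Dom[0].assign(N, false);
  Dom[0][0] = true;

  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t B = 1; B < N; ++B) {
      std::vector<bool> New(N, true);
      for (int P : Preds[B])
        for (std::size_t D = 0; D < N; ++D)
          New[D] = New[D] && Dom[P][D];
      New[B] = true; // a block always dominates itself
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// Returns true when every exit block (a block without successors) is
// dominated by at least one block that destroys or frees the coroutine.
bool isElisionSafe(const ToyCFG &G) {
  assert(G.Succs.size() == G.HasDestroy.size() && !G.Succs.empty());
  const auto Dom = computeDominators(G);
  for (std::size_t B = 0; B < G.Succs.size(); ++B) {
    if (!G.Succs[B].empty())
      continue; // not an exit block
    bool Covered = false;
    for (std::size_t D = 0; D < G.Succs.size(); ++D)
      Covered = Covered || (Dom[B][D] && G.HasDestroy[D]);
    if (!Covered)
      return false;
  }
  return true;
}
```

On a diamond-shaped graph (0 branches to 1 and 2, both join at 3), a destroy in block 3 or block 0 covers the single exit, while a destroy only in block 1 does not, because the path through block 2 bypasses it.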

lib/Transforms/Coroutines/CoroInstr.h

Show First 20 Lines • Show All 56 Lines • ▼ Show 20 Lines	public:
static inline bool classof(const IntrinsicInst *I) {		static inline bool classof(const IntrinsicInst *I) {
return I->getIntrinsicID() == Intrinsic::coro_subfn_addr;		return I->getIntrinsicID() == Intrinsic::coro_subfn_addr;
}		}
static inline bool classof(const Value *V) {		static inline bool classof(const Value *V) {
return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
}		}
};		};

		/// This represents the llvm.coro.alloc instruction.
		class LLVM_LIBRARY_VISIBILITY CoroAllocInst : public IntrinsicInst {
		public:
		// Methods to support type inquiry through isa, cast, and dyn_cast:
		static inline bool classof(const IntrinsicInst *I) {
		return I->getIntrinsicID() == Intrinsic::coro_alloc;
		}
		static inline bool classof(const Value *V) {
		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
		}
		};

		/// This represents the llvm.coro.frame instruction.
		class LLVM_LIBRARY_VISIBILITY CoroFrameInst : public IntrinsicInst {
		public:
		// Methods to support type inquiry through isa, cast, and dyn_cast:
		static inline bool classof(const IntrinsicInst *I) {
		return I->getIntrinsicID() == Intrinsic::coro_frame;
		}
		static inline bool classof(const Value *V) {
		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
		}
		};

		/// This represents the llvm.coro.free instruction.
		class LLVM_LIBRARY_VISIBILITY CoroFreeInst : public IntrinsicInst {
		public:
		// Methods to support type inquiry through isa, cast, and dyn_cast:
		static inline bool classof(const IntrinsicInst *I) {
		return I->getIntrinsicID() == Intrinsic::coro_free;
		}
		static inline bool classof(const Value *V) {
		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
		}
		};

/// This class represents the llvm.coro.begin instruction.		/// This class represents the llvm.coro.begin instruction.
class LLVM_LIBRARY_VISIBILITY CoroBeginInst : public IntrinsicInst {		class LLVM_LIBRARY_VISIBILITY CoroBeginInst : public IntrinsicInst {
enum { MemArg, AlignArg, PromiseArg, InfoArg };		enum { MemArg, ElideArg, AlignArg, PromiseArg, InfoArg };

public:		public:
		CoroAllocInst *getAlloc() const {
		auto *V = getArgOperand(ElideArg);
		if (isa<ConstantPointerNull>(V))
		return nullptr;

		return cast<CoroAllocInst>(V);
		}
		majnemer (inline comment, not done): I'd phrase this another way to make it more conservative: `CoroAllocInst *getAlloc() const { if (auto *CAI = dyn_cast<CoroAllocInst>(getArgOperand(ElideArg)->stripPointerCasts())) return CAI; return nullptr; }` This way, we won't crash if `undef` or a `bitcast` or `null` or some other weird operation is sitting in the elide arg.
		Value *getMem() const { return getArgOperand(MemArg); }

Constant *getRawInfo() const {		Constant *getRawInfo() const {
return cast<Constant>(getArgOperand(InfoArg)->stripPointerCasts());		return cast<Constant>(getArgOperand(InfoArg)->stripPointerCasts());
}		}

void setInfo(Constant *C) { setArgOperand(InfoArg, C); }		void setInfo(Constant *C) { setArgOperand(InfoArg, C); }

// Info argument of coro.begin is		// Info argument of coro.begin is
// fresh out of the frontend: null ;		// fresh out of the frontend: null ;
Show All 25 Lines	Info getInfo() const {
Constant *Initializer = GV->getInitializer();		Constant *Initializer = GV->getInitializer();
if ((Result.OutlinedParts = dyn_cast<ConstantStruct>(Initializer)))		if ((Result.OutlinedParts = dyn_cast<ConstantStruct>(Initializer)))
return Result;		return Result;

Result.Resumers = cast<ConstantArray>(Initializer);		Result.Resumers = cast<ConstantArray>(Initializer);
return Result;		return Result;
}		}

		// Replaces all coro.frame intrinsics that are associated with this coro.begin
		// to a replacement value and removes coro.begin and all of the coro.frame
		// intrinsics.
		void lowerTo(Value* Replacement) {
		SmallVector<CoroFrameInst*, 4> FrameInsts;
		for (auto *CF : this->users())
		FrameInsts.push_back(cast<CoroFrameInst>(CF));

		for (auto *CF : FrameInsts) {
		CF->replaceAllUsesWith(Replacement);
		CF->eraseFromParent();
		}

		this->eraseFromParent();
		}

// Methods for support type inquiry through isa, cast, and dyn_cast:		// Methods for support type inquiry through isa, cast, and dyn_cast:
static inline bool classof(const IntrinsicInst *I) {		static inline bool classof(const IntrinsicInst *I) {
return I->getIntrinsicID() == Intrinsic::coro_begin;		return I->getIntrinsicID() == Intrinsic::coro_begin;
}		}
static inline bool classof(const Value *V) {		static inline bool classof(const Value *V) {
return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
}		}
};		};

} // End namespace llvm.		} // End namespace llvm.
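
The inline comment on `getAlloc` asks for an accessor that returns `nullptr` for any unexpected operand instead of asserting. The idea can be sketched outside LLVM as follows. The class names are hypothetical stand-ins, and `dynamic_cast` is used here in place of LLVM's `dyn_cast`, so this illustrates the shape of the suggestion and is not the code that was committed.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::Value and friends; not the LLVM classes.
struct Value {
  virtual ~Value() = default;
  // Mirrors the idea of stripping pointer casts: look through cast wrappers.
  virtual const Value *stripPointerCasts() const { return this; }
};
struct CoroAlloc : Value {}; // plays the role of llvm.coro.alloc
struct NullPtr : Value {};   // plays the role of a null constant
struct Undef : Value {};     // plays the role of undef
struct BitCast : Value {
  const Value *Operand;
  explicit BitCast(const Value *Op) : Operand(Op) {}
  const Value *stripPointerCasts() const override {
    return Operand->stripPointerCasts();
  }
};

// Conservative accessor: after looking through casts, anything that is not
// a CoroAlloc yields nullptr, so callers simply skip the elision.
const CoroAlloc *getAllocConservative(const Value *ElideArg) {
  return dynamic_cast<const CoroAlloc *>(ElideArg->stripPointerCasts());
}
```

With this shape, a null constant, an `undef`, or a cast of something unrelated all take the same "no elision" path, and a cast wrapped around a real `coro.alloc` is still recognized.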

lib/Transforms/Coroutines/CoroInternal.h

	Show All 36 Lines
	#define UNPREPARED_FOR_SPLIT "0"			#define UNPREPARED_FOR_SPLIT "0"
	#define PREPARED_FOR_SPLIT "1"			#define PREPARED_FOR_SPLIT "1"

	#define CORO_DEVIRT_TRIGGER_FN "coro.devirt.trigger"			#define CORO_DEVIRT_TRIGGER_FN "coro.devirt.trigger"

	namespace coro {			namespace coro {

	bool declaresIntrinsics(Module &M, std::initializer_list<StringRef>);			bool declaresIntrinsics(Module &M, std::initializer_list<StringRef>);
				void replaceAllCoroFrees(CoroBeginInst *CB, Value *Replacement);

	// Keeps data and helper functions for lowering coroutine intrinsics.			// Keeps data and helper functions for lowering coroutine intrinsics.
	struct LowererBase {			struct LowererBase {
	Module &TheModule;			Module &TheModule;
	LLVMContext &Context;			LLVMContext &Context;
	FunctionType *const ResumeFnType;			FunctionType *const ResumeFnType;

	LowererBase(Module &M);			LowererBase(Module &M);
	Value *makeSubFnCall(Value *Arg, int Index, Instruction *InsertPt);			Value *makeSubFnCall(Value *Arg, int Index, Instruction *InsertPt);
	};			};

	} // End namespace coro.			} // End namespace coro.
	} // End namespace llvm			} // End namespace llvm

	#endif			#endif

lib/Transforms/Coroutines/Coroutines.cpp

Show First 20 Lines • Show All 116 Lines • ▼ Show 20 Lines	bool coro::declaresIntrinsics(Module &M,
for (StringRef Name : List) {		for (StringRef Name : List) {
assert(isCoroutineIntrinsicName(Name) && "not a coroutine intrinsic");		assert(isCoroutineIntrinsicName(Name) && "not a coroutine intrinsic");
if (M.getNamedValue(Name))		if (M.getNamedValue(Name))
return true;		return true;
}		}

return false;		return false;
}		}

		// Find all llvm.coro.free instructions associated with the provided coro.begin
		// and replace them with the provided replacement value.
		void coro::replaceAllCoroFrees(CoroBeginInst *CB, Value *Replacement) {
		SmallVector<CoroFreeInst *, 4> CoroFrees;
		for (User *FramePtr: CB->users())
		for (User *U : FramePtr->users())
		if (auto *CF = dyn_cast<CoroFreeInst>(U))
		CoroFrees.push_back(CF);

		if (CoroFrees.empty())
		return;

		for (CoroFreeInst *CF : CoroFrees) {
		CF->replaceAllUsesWith(Replacement);
		CF->eraseFromParent();
		}
		}
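
Both `replaceAllCoroFrees` and `CoroBeginInst::lowerTo` first collect the instructions of interest into a `SmallVector` and only then erase them, presumably because erasing while walking a use list would invalidate the iteration. A minimal sketch of the same two-phase pattern on a standard container is shown below. `ToyInst` and `eraseCoroFrees` are hypothetical names, and `std::list` merely stands in for a use list.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical stand-in for an instruction in a use list.
struct ToyInst {
  bool IsCoroFree; // true for entries that play the role of llvm.coro.free
  int Id;
};

// Removes every coro.free-like entry from Users and returns how many
// entries were removed.
std::size_t eraseCoroFrees(std::list<ToyInst> &Users) {
  // Phase 1: walk the list without mutating it and remember the matches.
  std::vector<std::list<ToyInst>::iterator> Worklist;
  for (auto It = Users.begin(); It != Users.end(); ++It)
    if (It->IsCoroFree)
      Worklist.push_back(It);

  // Phase 2: mutate. Erasing from a std::list invalidates only the erased
  // iterator, so the remaining worklist entries stay valid.
  for (auto It : Worklist)
    Users.erase(It);
  return Worklist.size();
}
```

Erasing inside the first loop would advance an iterator that has just been invalidated, which is the hazard the collect-first structure avoids.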

test/Transforms/Coroutines/coro-elide.ll

; Tests that the coro.destroy and coro.resume are devirtualized where possible,		; Tests that the coro.destroy and coro.resume are devirtualized where possible,
; SCC pipeline restarts and inlines the direct calls.		; SCC pipeline restarts and inlines the direct calls.
; RUN: opt < %s -S -inline -coro-elide \| FileCheck %s		; RUN: opt < %s -S -inline -coro-elide -dce \| FileCheck %s

declare void @print(i32) nounwind		declare void @print(i32) nounwind

; resume part of the coroutine		; resume part of the coroutine
define fastcc void @f.resume(i8*) {		define fastcc void @f.resume(i8*) {
tail call void @print(i32 0)		tail call void @print(i32 0)
ret void		ret void
}		}

; destroy part of the coroutine		; destroy part of the coroutine
define fastcc void @f.destroy(i8*) {		define fastcc void @f.destroy(i8*) {
tail call void @print(i32 1)		tail call void @print(i32 1)
ret void		ret void
}		}

@f.resumers = internal constant [2 x void (i8*)*] [void (i8*)* @f.resume,		@f.resumers = internal constant [2 x void (i8*)*] [void (i8*)* @f.resume,
void (i8*)* @f.destroy]		void (i8*)* @f.destroy]

; a coroutine start function		; a coroutine start function
define i8* @f() {		define i8* @f() {
entry:		entry:
%hdl = call i8* @llvm.coro.begin(i8* null, i32 0, i8* null,		%tok = call token @llvm.coro.begin(i8* null, i8* null, i32 0, i8* null,
i8* bitcast ([2 x void (i8*)*]* @f.resumers to i8*))		i8* bitcast ([2 x void (i8*)*]* @f.resumers to i8*))
		%hdl = call i8* @llvm.coro.frame(token %tok)
ret i8* %hdl		ret i8* %hdl
}		}

; CHECK-LABEL: @callResume(		; CHECK-LABEL: @callResume(
define void @callResume() {		define void @callResume() {
entry:		entry:
; CHECK: call i8* @llvm.coro.begin		; CHECK: call token @llvm.coro.begin
%hdl = call i8* @f()		%hdl = call i8* @f()

; CHECK-NEXT: call void @print(i32 0)		; CHECK-NEXT: call void @print(i32 0)
%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)		%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)
%1 = bitcast i8* %0 to void (i8*)*		%1 = bitcast i8* %0 to void (i8*)*
call fastcc void %1(i8* %hdl)		call fastcc void %1(i8* %hdl)

; CHECK-NEXT: call void @print(i32 1)		; CHECK-NEXT: call void @print(i32 1)
%2 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)		%2 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)
%3 = bitcast i8* %2 to void (i8*)*		%3 = bitcast i8* %2 to void (i8*)*
call fastcc void %3(i8* %hdl)		call fastcc void %3(i8* %hdl)

; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
ret void		ret void
}		}

; CHECK-LABEL: @eh(		; CHECK-LABEL: @eh(
define void @eh() personality i8* null {		define void @eh() personality i8* null {
entry:		entry:
; CHECK: call i8* @llvm.coro.begin		; CHECK: call token @llvm.coro.begin
%hdl = call i8* @f()		%hdl = call i8* @f()

; CHECK-NEXT: call void @print(i32 0)		; CHECK-NEXT: call void @print(i32 0)
%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)		%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)
%1 = bitcast i8* %0 to void (i8*)*		%1 = bitcast i8* %0 to void (i8*)*
invoke void %1(i8* %hdl)		invoke void %1(i8* %hdl)
to label %cont unwind label %ehcleanup		to label %cont unwind label %ehcleanup
cont:		cont:
ret void		ret void

ehcleanup:		ehcleanup:
%tok = cleanuppad within none []		%tok = cleanuppad within none []
cleanupret from %tok unwind to caller		cleanupret from %tok unwind to caller
}		}

; CHECK-LABEL: @no_devirt_info_null(		; CHECK-LABEL: @no_devirt_info_null(
; no devirtualization here, since coro.begin info parameter is null		; no devirtualization here, since coro.begin info parameter is null
define void @no_devirt_info_null() {		define void @no_devirt_info_null() {
entry:		entry:
%hdl = call i8* @llvm.coro.begin(i8* null, i32 0, i8* null, i8* null)		%tok = call token @llvm.coro.begin(i8* null, i8* null, i32 0, i8* null, i8* null)
		%hdl = call i8* @llvm.coro.frame(token %tok)

; CHECK: call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)		; CHECK: call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)
%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)		%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)
%1 = bitcast i8* %0 to void (i8*)*		%1 = bitcast i8* %0 to void (i8*)*
call fastcc void %1(i8* %hdl)		call fastcc void %1(i8* %hdl)

; CHECK: call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)		; CHECK: call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)
%2 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)		%2 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)
Show All 19 Lines	; CHECK: call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)
%3 = bitcast i8* %2 to void (i8*)*		%3 = bitcast i8* %2 to void (i8*)*
call fastcc void %3(i8* %hdl)		call fastcc void %3(i8* %hdl)

; CHECK: ret void		; CHECK: ret void
ret void		ret void
}		}


declare i8* @llvm.coro.begin(i8*, i32, i8*, i8*)		declare token @llvm.coro.begin(i8*, i8*, i32, i8*, i8*)
		declare i8* @llvm.coro.frame(token)
declare i8* @llvm.coro.subfn.addr(i8*, i8)		declare i8* @llvm.coro.subfn.addr(i8*, i8)

test/Transforms/Coroutines/coro-heap-elide.ll

This file was added.

				; Tests that the dynamic allocation and deallocation of the coroutine frame is
				; elided and that any tail calls referencing the coroutine frame have the
				; tail call attribute removed.
				; RUN: opt < %s -S -inline -coro-elide -instsimplify -simplifycfg \| FileCheck %s

				declare void @print(i32) nounwind

				%f.frame = type {i32}

				declare void @bar(i8*)

				declare fastcc void @f.resume(%f.frame*)
				declare fastcc void @f.destroy(%f.frame*)

				declare void @may_throw()
				declare i8* @CustomAlloc(i32)
				declare void @CustomFree(i8*)

				@f.resumers = internal constant
				[2 x void (%f.frame*)*] [void (%f.frame*)* @f.resume, void (%f.frame*)* @f.destroy]

				; a coroutine start function
				define i8* @f() personality i8* null {
				entry:
				%elide = call i8* @llvm.coro.alloc()
				%need.dyn.alloc = icmp ne i8* %elide, null
				br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc
				dyn.alloc:
				%alloc = call i8* @CustomAlloc(i32 4)
				br label %coro.begin
				coro.begin:
				%phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ]
				%beg = call token @llvm.coro.begin(i8* %phi, i8* %elide, i32 0, i8* null,
				i8* bitcast ([2 x void (%f.frame*)*]* @f.resumers to i8*))
				%hdl = call i8* @llvm.coro.frame(token %beg)
				invoke void @may_throw()
				to label %ret unwind label %ehcleanup
				ret:
				ret i8* %hdl

				ehcleanup:
				%tok = cleanuppad within none []
				%mem = call i8* @llvm.coro.free(i8* %hdl)
				%need.dyn.free = icmp ne i8* %mem, null
				br i1 %need.dyn.free, label %dyn.free, label %if.end
				dyn.free:
				call void @CustomFree(i8* %mem)
				br label %if.end
				if.end:
				cleanupret from %tok unwind to caller
				}

				; CHECK-LABEL: @callResume(
				define void @callResume() {
				entry:
				; CHECK: alloca %f.frame
				; CHECK-NOT: coro.begin
				; CHECK-NOT: CustomAlloc
				; CHECK: call void @may_throw()
				%hdl = call i8* @f()

				; Need to remove 'tail' from the first call to @bar
				; CHECK-NOT: tail call void @bar(
				; CHECK: call void @bar(
				tail call void @bar(i8* %hdl)
				; CHECK: tail call void @bar(
				tail call void @bar(i8* null)

				; CHECK-NEXT: call fastcc void bitcast (void (%f.frame*)* @f.resume to void (i8*)*)(i8* %vFrame)
				%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)
				%1 = bitcast i8* %0 to void (i8*)*
				call fastcc void %1(i8* %hdl)

				; CHECK-NEXT: call fastcc void bitcast (void (%f.frame*)* @f.destroy to void (i8*)*)(i8* %vFrame)
				%2 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)
				%3 = bitcast i8* %2 to void (i8*)*
				call fastcc void %3(i8* %hdl)

				; CHECK-NEXT: ret void
				ret void
				}

				; a coroutine start function (cannot elide heap alloc, because the second
				; argument to coro.begin does not point to coro.alloc)
				define i8* @f_no_elision() personality i8* null {
				entry:
				%alloc = call i8* @CustomAlloc(i32 4)
				%beg = call token @llvm.coro.begin(i8* %alloc, i8* null, i32 0, i8* null,
				i8* bitcast ([2 x void (%f.frame*)*]* @f.resumers to i8*))
				%hdl = call i8* @llvm.coro.frame(token %beg)
				ret i8* %hdl
				}

				; CHECK-LABEL: @callResume_no_elision(
				define void @callResume_no_elision() {
				entry:
				; CHECK: call i8* @CustomAlloc(
				%hdl = call i8* @f_no_elision()

				; Tail calls should remain tail calls
				; CHECK: tail call void @bar(
				tail call void @bar(i8* %hdl)
				; CHECK: tail call void @bar(
				tail call void @bar(i8* null)

				; CHECK-NEXT: call fastcc void bitcast (void (%f.frame*)* @f.resume to void (i8*)*)(i8*
				%0 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 0)
				%1 = bitcast i8* %0 to void (i8*)*
				call fastcc void %1(i8* %hdl)

				; CHECK-NEXT: call fastcc void bitcast (void (%f.frame*)* @f.destroy to void (i8*)*)(i8*
				%2 = call i8* @llvm.coro.subfn.addr(i8* %hdl, i8 1)
				%3 = bitcast i8* %2 to void (i8*)*
				call fastcc void %3(i8* %hdl)

				; CHECK-NEXT: ret void
				ret void
				}


				declare i8* @llvm.coro.alloc()
				declare i8* @llvm.coro.free(i8*)
				declare token @llvm.coro.begin(i8*, i8*, i32, i8*, i8*)
				declare i8* @llvm.coro.frame(token)
				declare i8* @llvm.coro.subfn.addr(i8*, i8)
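
The `@f` function in this test follows the frontend pattern described in the `elideHeapAllocations` comment: use the storage returned by `coro.alloc` when it is non-null, otherwise allocate dynamically, and on cleanup free only what `coro.free` returns. The control flow can be sketched in plain C++ as follows. All names here (`toyCoroAlloc`, `toyCoroFree`, `toyStart`, `toyDestroy`) are hypothetical helpers that only imitate the intrinsics; they are assumptions made for illustration, not part of this patch.

```cpp
#include <cassert>
#include <cstdlib>

struct ToyFrame {
  int Value;
};

// Stand-in for llvm.coro.alloc: yields caller-provided storage when the
// allocation was elided, and nullptr when dynamic allocation is required.
void *toyCoroAlloc(void *ElidedStorage) { return ElidedStorage; }

// Stand-in for llvm.coro.free: yields the memory to release, or nullptr
// when the frame does not live on the heap.
void *toyCoroFree(void *Frame, bool OnHeap) { return OnHeap ? Frame : nullptr; }

struct ToyCoroutine {
  void *Frame;
  bool OnHeap;
};

// mem = coro.alloc(); if (!mem) mem = malloc(coro.size()); coro.begin(mem)
ToyCoroutine toyStart(void *ElidedStorage) {
  void *Mem = toyCoroAlloc(ElidedStorage);
  bool OnHeap = false;
  if (!Mem) {
    Mem = std::malloc(sizeof(ToyFrame));
    OnHeap = true;
  }
  return {Mem, OnHeap};
}

// mem = coro.free(hdl); if (mem) free(mem)
void toyDestroy(ToyCoroutine &C) {
  if (void *Mem = toyCoroFree(C.Frame, C.OnHeap))
    std::free(Mem);
  C.Frame = nullptr;
}
```

Passing the address of a local `ToyFrame` to `toyStart` models the elided case checked by `@callResume`, and passing `nullptr` models `@f_no_elision`, where the custom allocator call remains.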

test/Transforms/Coroutines/restart-trigger.ll

	; Verifies that restart trigger forces IPO pipelines restart and the same			; Verifies that restart trigger forces IPO pipelines restart and the same
	; coroutine is looked at by CoroSplit pass twice.			; coroutine is looked at by CoroSplit pass twice.
	; REQUIRES: asserts			; REQUIRES: asserts
	; RUN: opt < %s -S -O0 -enable-coroutines -debug-only=coro-split 2>&1 \| FileCheck %s			; RUN: opt < %s -S -O0 -enable-coroutines -debug-only=coro-split 2>&1 \| FileCheck %s
	; RUN: opt < %s -S -O1 -enable-coroutines -debug-only=coro-split 2>&1 \| FileCheck %s			; RUN: opt < %s -S -O1 -enable-coroutines -debug-only=coro-split 2>&1 \| FileCheck %s

	; CHECK: CoroSplit: Processing coroutine 'f' state: 0			; CHECK: CoroSplit: Processing coroutine 'f' state: 0
	; CHECK-NEXT: CoroSplit: Processing coroutine 'f' state: 1			; CHECK-NEXT: CoroSplit: Processing coroutine 'f' state: 1

	declare i8* @llvm.coro.begin(i8*, i32, i8*, i8*)			declare token @llvm.coro.begin(i8*, i8*, i32, i8*, i8*)

	; a coroutine start function			; a coroutine start function
	define i8* @f() {			define void @f() {
	entry:			call token @llvm.coro.begin(i8* null, i8* null, i32 0, i8* null, i8* null)
	%hdl = call i8* @llvm.coro.begin(i8* null, i32 0, i8* null, i8* null)			ret void
	ret i8* %hdl
	}			}

This is an archive of the discontinued LLVM Phabricator instance.

[Coroutines] Part 6: Elide dynamic allocation of a coroutine frame when possible
ClosedPublic

Revision Contents

Diff 67521

docs/Coroutines.rst

include/llvm/IR/Intrinsics.td

lib/Transforms/Coroutines/CoroElide.cpp

lib/Transforms/Coroutines/CoroInstr.h

lib/Transforms/Coroutines/CoroInternal.h

lib/Transforms/Coroutines/Coroutines.cpp

test/Transforms/Coroutines/coro-elide.ll

test/Transforms/Coroutines/coro-heap-elide.ll

test/Transforms/Coroutines/restart-trigger.ll
