This is an archive of the discontinued LLVM Phabricator instance.

Differential D106248

[Coroutines] Overalign coroutine frame when frame alignment exceeds the alignment limit
Abandoned · Public

Authored by ChuanqiXu on Jul 18 2021, 7:23 PM.

Details

Reviewers

ychen
lxfind
rjmccall

Summary

This is an alternative to D97915.

Background: ::operator new(size_t), which is used to allocate memory for the coroutine frame, generally guarantees an alignment of only 16 bytes. If a variable in the frame requires a larger alignment, it may not be aligned correctly. One solution is to look up ::operator new(size_t, align_val_t) in the frontend, but that requires support on the language side. Another solution is to let the compiler over-align the frame. By allocating llvm.coro.size() + frame_align - new_align bytes, we can make sure that the frame can be aligned correctly within the over-allocated storage. To avoid a memory leak, we need one extra field in the frame to store the address of the allocated storage.
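
The arithmetic in the background paragraph can be sketched in plain C++. This is only an illustration under stated assumptions: the names OveralignedBlock, paddedSize, alignUp, allocateOveraligned, and freeOveraligned are hypothetical and do not appear in the patch, which emits the equivalent logic as LLVM IR in CoroFrame.cpp and keeps the raw address in a frame field rather than in a separate struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical pair of addresses: what the allocator returned and where the
// frame actually starts. The raw address must be kept for deallocation.
struct OveralignedBlock {
  void *raw;
  void *frame;
};

// Bytes to request so that a frame of frame_size bytes aligned to frame_align
// always fits, assuming the allocator only guarantees new_align.
constexpr std::size_t paddedSize(std::size_t frame_size,
                                 std::size_t frame_align,
                                 std::size_t new_align) {
  return frame_align > new_align ? frame_size + frame_align - new_align
                                 : frame_size;
}

// Round addr up to the next multiple of align (align must be a power of two).
constexpr std::uintptr_t alignUp(std::uintptr_t addr, std::uintptr_t align) {
  return (addr + align - 1) & ~(align - 1);
}

// Over-allocate with plain ::operator new(size_t) and align the frame by hand.
inline OveralignedBlock allocateOveraligned(std::size_t frame_size,
                                            std::size_t frame_align,
                                            std::size_t new_align) {
  void *raw = ::operator new(paddedSize(frame_size, frame_align, new_align));
  std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw);
  void *frame = reinterpret_cast<void *>(alignUp(addr, frame_align));
  return OveralignedBlock{raw, frame};
}

// Free through the raw address, never through the aligned frame address.
inline void freeOveraligned(const OveralignedBlock &block) {
  ::operator delete(block.raw);
}
```

In the worst case the allocator returns an address that is new_align past a frame_align boundary, so the frame starts frame_align - new_align bytes into the block, which is exactly the extra padding requested.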

The main differences between this diff and D97915 are:

It adds only one extra, optional coroutine intrinsic, for the switch style.
It does the main work of over-aligning the coroutine frame in CoroFrame.cpp.

The first point may be the more important one, since the coroutine intrinsics are a language in themselves. Although the switch style appears to be used only by C++ coroutines, by design it could and should be easy for other languages to use. So the fewer intrinsics we add, the clearer the coroutine intrinsics remain. I also think the extra intrinsic added in this diff is easier for users to use: if a user needs to over-align the frame because the allocator's alignment is limited, they can emit one coro.overalign for the allocated memory and pass the allocator's alignment limit. Everything stays the same if the user doesn't need it.

The second point makes this diff easier to review and edit.

Test Plan: check-llvm, https://gcc.godbolt.org/z/rGzaco, https://gcc.godbolt.org/z/W3ocj778M

Another question: is it possible to add these tests to LLVM? It looks like the regression tests and unit tests can't check runtime behavior. I know there are slow tests in the CI system; how could I edit those?

The reason I don't support ::operator new(size_t, align_t) here is that I am confused about the detailed semantics. For example, if the user provides both allocators, which one should be chosen? Suppose we say the compiler may choose. Then another question is whether we need to consider both allocators when the user doesn't provide both of them. For example, if the user provides only promise_type::operator new(size_t), do we need to consider ::operator new(size_t, align_t)? Or if the user provides promise_type::operator new(size_t, align_t), do we still need to consider ::operator new(size_t)? There are decisions to make here. It is easy to make a decision, but I think it is hard to reach consensus. So I don't support operator new(size_t, align_t).

But I think it should not be hard to support them on top of this diff. For example, let's define two expressions:

Expr *AlignedAllocator;
Expr *AlignedDeallocator;

Then we need to initialize them in Sema. In CodeGen, we could emit:

Alloc:         ; controlled by `llvm.coro.alloc`
   %aligned = call i1 @llvm.coro.aligned()
   br i1 %aligned, label %align_alloc, label %normal_alloc
; ...
Dealloc:       ; controlled by the nullness of `llvm.coro.free`
   br i1 %aligned, label %align_dealloc, label %normal_dealloc

Then in the middle end, we could simply replace the value of llvm.coro.aligned.
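
The branch structure above can be mirrored in source-level C++ as a rough sketch. It assumes C++17 (for std::align_val_t and __STDCPP_DEFAULT_NEW_ALIGNMENT__); frameNeedsAlignedNew, allocFrame, and freeFrame are hypothetical names standing in for the value the middle end would substitute for llvm.coro.aligned and for the two pairs of branches. Nothing here is taken from the patch itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical stand-in for the constant that would replace llvm.coro.aligned:
// true when the frame needs more alignment than plain operator new guarantees.
constexpr bool frameNeedsAlignedNew(std::size_t frame_align) {
  return frame_align > __STDCPP_DEFAULT_NEW_ALIGNMENT__;
}

// Mirrors the align_alloc / normal_alloc branches.
inline void *allocFrame(std::size_t size, std::size_t frame_align) {
  if (frameNeedsAlignedNew(frame_align))
    return ::operator new(size, std::align_val_t(frame_align));
  return ::operator new(size);
}

// Mirrors the align_dealloc / normal_dealloc branches. The same predicate must
// be used on both sides so that allocation and deallocation forms match.
inline void freeFrame(void *mem, std::size_t frame_align) {
  if (frameNeedsAlignedNew(frame_align))
    ::operator delete(mem, std::align_val_t(frame_align));
  else
    ::operator delete(mem);
}
```

Because the predicate depends only on the frame alignment, which is known once the frame layout is computed, both branches could be folded to a single path after lowering.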

Diff Detail

Unit Tests: Failed

	Time	Test
	2,910 ms	x64 debian > libarcher.critical::critical.c
	2,700 ms	x64 debian > libarcher.races::critical-unrelated.c
	2,720 ms	x64 debian > libarcher.races::lock-nested-unrelated.c
	2,510 ms	x64 debian > libarcher.races::lock-unrelated.c
	2,670 ms	x64 debian > libarcher.races::parallel-simple.c
		View Full Test Results (17 Failed)

Event Timeline

ChuanqiXu created this revision. Jul 18 2021, 7:23 PM

Herald added a subscriber: hiraditya. Jul 18 2021, 7:23 PM

ChuanqiXu requested review of this revision. Jul 18 2021, 7:23 PM

Herald added a project: Restricted Project. Jul 18 2021, 7:23 PM

Herald added subscribers: llvm-commits, jdoerfert.

ChuanqiXu edited the summary of this revision. Jul 18 2021, 8:02 PM

Harbormaster completed remote builds in B114764: Diff 359662. Jul 18 2021, 8:03 PM

Thanks for the patch. Does it make more sense to compare this with D100739? (The implementation details are slightly different, but the idea is similar: generating code to handle the overaligned frame in the backend.) Before actually reviewing this patch, I think a consensus is needed about where to implement this: LLVM or Clang. My preference would be Clang; however, I think implementing this in LLVM is simpler. I'd appreciate other reviewers' comments.

In D106248#2892459, @ychen wrote:

Thanks for the patch. Does it make more sense to compare this with D100739? (The implementation details are slightly different, but the idea is similar: generating code to handle the overaligned frame in the backend.)

Yeah, compared with D100739, the overall idea is similar. The main difference is the design of the introduced intrinsics: llvm.coro.overalign in this diff versus llvm.coro.size.aligned in D100739. The llvm.coro.overalign intrinsic, emitted by the frontend, returns the address of the allocated space. It gives more control to users: the user is able to access the allocated space even once the coroutine frame and the allocated space no longer refer to the same thing. The pairing of llvm.coro.size and llvm.coro.size.aligned also seems slightly odd to me. Since the LLVM coroutine intrinsics are a language in themselves, I think this matters.

I think a consensus is needed about where to implement this: LLVM or Clang. My preference would be Clang; however, I think implementing this in LLVM is simpler. I'd appreciate other reviewers' comments.

My preference would be LLVM. First, it is simpler. Second, it should be the responsibility of the middle end, since by design Clang/LLVM coroutines leave the work of managing coroutine memory to the middle end. I also don't see what benefit we would get from implementing it in Clang; after all, we would still need some work in the middle end.

avogelsgesang added a subscriber: avogelsgesang. Jul 28 2021, 3:36 AM

ChuanqiXu planned changes to this revision. Nov 15 2021, 2:25 AM

ChuanqiXu abandoned this revision. Nov 15 2021, 7:38 PM

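Revision Contents

	Path	Size
	clang/lib/CodeGen/CGCoroutine.cpp	23 lines
	clang/test/CodeGenCoroutines/coro-alloc.cpp	7 lines
	clang/test/CodeGenCoroutines/coro-gro-nrvo.cpp	47 lines
	llvm/docs/Coroutines.rst	60 lines
	llvm/include/llvm/IR/Intrinsics.td	1 line
	llvm/lib/Transforms/Coroutines/CoroFrame.cpp	104 lines
	llvm/lib/Transforms/Coroutines/CoroInstr.h	12 lines
	llvm/lib/Transforms/Coroutines/CoroInternal.h	2 lines
	llvm/lib/Transforms/Coroutines/CoroSplit.cpp	7 lines
	llvm/lib/Transforms/Coroutines/Coroutines.cpp	14 lines
	llvm/test/Transforms/Coroutines/coro-alloca-overalign.ll	70 lines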

Diff 359662

clang/lib/CodeGen/CGCoroutine.cpp

	(541 lines not shown)

	void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {			void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
	auto *NullPtr = llvm::ConstantPointerNull::get(Builder.getInt8PtrTy());			auto *NullPtr = llvm::ConstantPointerNull::get(Builder.getInt8PtrTy());
	auto &TI = CGM.getContext().getTargetInfo();			auto &TI = CGM.getContext().getTargetInfo();
	unsigned NewAlign = TI.getNewAlign() / TI.getCharWidth();			unsigned NewAlign = TI.getNewAlign() / TI.getCharWidth();

	auto *EntryBB = Builder.GetInsertBlock();			auto *EntryBB = Builder.GetInsertBlock();
	auto *AllocBB = createBasicBlock("coro.alloc");			auto *AllocBB = createBasicBlock("coro.alloc");
				auto *AlignBB = createBasicBlock("coro.align");
	auto *InitBB = createBasicBlock("coro.init");			auto *InitBB = createBasicBlock("coro.init");
	auto *FinalBB = createBasicBlock("coro.final");			auto *FinalBB = createBasicBlock("coro.final");
	auto *RetBB = createBasicBlock("coro.ret");			auto *RetBB = createBasicBlock("coro.ret");

	auto *CoroId = Builder.CreateCall(			auto *CoroId = Builder.CreateCall(
	CGM.getIntrinsic(llvm::Intrinsic::coro_id),			CGM.getIntrinsic(llvm::Intrinsic::coro_id),
	{Builder.getInt32(NewAlign), NullPtr, NullPtr, NullPtr});			{Builder.getInt32(NewAlign), NullPtr, NullPtr, NullPtr});
	createCoroData(*this, CurCoro, CoroId);			createCoroData(*this, CurCoro, CoroId);
	CurCoro.Data->SuspendBB = RetBB;			CurCoro.Data->SuspendBB = RetBB;
	assert(ShouldEmitLifetimeMarkers &&			assert(ShouldEmitLifetimeMarkers &&
	"Must emit lifetime intrinsics for coroutines");			"Must emit lifetime intrinsics for coroutines");

	// Backend is allowed to elide memory allocations, to help it, emit			// Backend is allowed to elide memory allocations, to help it, emit
	// auto mem = coro.alloc() ? 0 : ... allocation code ...;			// auto mem = coro.alloc() ? 0 : ... allocation code ...;
	auto *CoroAlloc = Builder.CreateCall(			auto *CoroAlloc = Builder.CreateCall(
	CGM.getIntrinsic(llvm::Intrinsic::coro_alloc), {CoroId});			CGM.getIntrinsic(llvm::Intrinsic::coro_alloc), {CoroId});

	Builder.CreateCondBr(CoroAlloc, AllocBB, InitBB);			Builder.CreateCondBr(CoroAlloc, AllocBB, InitBB);

	EmitBlock(AllocBB);			EmitBlock(AllocBB);
	auto *AllocateCall = EmitScalarExpr(S.getAllocate());			auto *AllocateCall = EmitScalarExpr(S.getAllocate());
	auto *AllocOrInvokeContBB = Builder.GetInsertBlock();

	// Handle allocation failure if 'ReturnStmtOnAllocFailure' was provided.			// Handle allocation failure if 'ReturnStmtOnAllocFailure' was provided.
	if (auto *RetOnAllocFailure = S.getReturnStmtOnAllocFailure()) {			if (auto *RetOnAllocFailure = S.getReturnStmtOnAllocFailure()) {
	auto *RetOnFailureBB = createBasicBlock("coro.ret.on.failure");			auto *RetOnFailureBB = createBasicBlock("coro.ret.on.failure");

	// See if allocation was successful.			// See if allocation was successful.
	auto *NullPtr = llvm::ConstantPointerNull::get(Int8PtrTy);			auto *NullPtr = llvm::ConstantPointerNull::get(Int8PtrTy);
	auto *Cond = Builder.CreateICmpNE(AllocateCall, NullPtr);			auto *Cond = Builder.CreateICmpNE(AllocateCall, NullPtr);
	Builder.CreateCondBr(Cond, InitBB, RetOnFailureBB);			Builder.CreateCondBr(Cond, AlignBB, RetOnFailureBB);

	// If not, return OnAllocFailure object.			// If not, return OnAllocFailure object.
	EmitBlock(RetOnFailureBB);			EmitBlock(RetOnFailureBB);
	EmitStmt(RetOnAllocFailure);			EmitStmt(RetOnAllocFailure);
				} else {
				Builder.CreateBr(AlignBB);
	}			}
	else {
	Builder.CreateBr(InitBB);			EmitBlock(AlignBB);
	}			// Since the standard doesn't support the aligned allocator,
				// `operator new(size_t, align_t)`, if there is variable in frame
				// whose alignment requirement exceeds `std::max_align_t` may
				// not get aligned correctly.
				// Pass `std::max_align_t` to the middle end to make the compiler
				// to overalign more space to make the frame align correctly.
				auto *AlignedAllocateCall = Builder.CreateCall(
				CGM.getIntrinsic(llvm::Intrinsic::coro_overalign),
				{AllocateCall, llvm::ConstantInt::get(
				llvm::Type::getInt64Ty(getLLVMContext()), NewAlign)});

	EmitBlock(InitBB);			EmitBlock(InitBB);

	// Pass the result of the allocation to coro.begin.			// Pass the result of the allocation to coro.begin.
	auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);			auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);
	Phi->addIncoming(NullPtr, EntryBB);			Phi->addIncoming(NullPtr, EntryBB);
	Phi->addIncoming(AllocateCall, AllocOrInvokeContBB);			Phi->addIncoming(AlignedAllocateCall, AlignBB);
	auto *CoroBegin = Builder.CreateCall(			auto *CoroBegin = Builder.CreateCall(
	CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});			CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});
	CurCoro.Data->CoroBegin = CoroBegin;			CurCoro.Data->CoroBegin = CoroBegin;

	GetReturnObjectManager GroManager(*this, S);			GetReturnObjectManager GroManager(*this, S);
	GroManager.EmitGroAlloca();			GroManager.EmitGroAlloca();

	CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(RetBB);			CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(RetBB);
	(168 lines not shown)

clang/test/CodeGenCoroutines/coro-alloc.cpp

	(56 lines not shown)
	extern "C" void f0(global_new_delete_tag) {			extern "C" void f0(global_new_delete_tag) {
	// CHECK: %[[ID:.+]] = call token @llvm.coro.id(i32 16			// CHECK: %[[ID:.+]] = call token @llvm.coro.id(i32 16
	// CHECK: %[[NeedAlloc:.+]] = call i1 @llvm.coro.alloc(token %[[ID]])			// CHECK: %[[NeedAlloc:.+]] = call i1 @llvm.coro.alloc(token %[[ID]])
	// CHECK: br i1 %[[NeedAlloc]], label %[[AllocBB:.+]], label %[[InitBB:.+]]			// CHECK: br i1 %[[NeedAlloc]], label %[[AllocBB:.+]], label %[[InitBB:.+]]

	// CHECK: [[AllocBB]]:			// CHECK: [[AllocBB]]:
	// CHECK: %[[SIZE:.+]] = call i64 @llvm.coro.size.i64()			// CHECK: %[[SIZE:.+]] = call i64 @llvm.coro.size.i64()
	// CHECK: %[[MEM:.+]] = call noalias nonnull i8* @_Znwm(i64 %[[SIZE]])			// CHECK: %[[MEM:.+]] = call noalias nonnull i8* @_Znwm(i64 %[[SIZE]])
	// CHECK: br label %[[InitBB]]			// CHECK: br label %[[AlignedBB:.+]]

				// CHECK: [[AlignedBB]]:
				// CHECK: %[[ALIGNED_MEM:.+]] = call i8* @llvm.coro.overalign(i8* %[[MEM]]

	// CHECK: [[InitBB]]:			// CHECK: [[InitBB]]:
	// CHECK: %[[PHI:.+]] = phi i8* [ null, %{{.+}} ], [ %call, %[[AllocBB]] ]			// CHECK: %[[PHI:.+]] = phi i8* [ null, %{{.+}} ], [ %[[ALIGNED_MEM]], %[[AlignedBB]] ]
	// CHECK: %[[FRAME:.+]] = call i8* @llvm.coro.begin(token %[[ID]], i8* %[[PHI]])			// CHECK: %[[FRAME:.+]] = call i8* @llvm.coro.begin(token %[[ID]], i8* %[[PHI]])

	// CHECK: %[[MEM:.+]] = call i8* @llvm.coro.free(token %[[ID]], i8* %[[FRAME]])			// CHECK: %[[MEM:.+]] = call i8* @llvm.coro.free(token %[[ID]], i8* %[[FRAME]])
	// CHECK: %[[NeedDealloc:.+]] = icmp ne i8* %[[MEM]], null			// CHECK: %[[NeedDealloc:.+]] = icmp ne i8* %[[MEM]], null
	// CHECK: br i1 %[[NeedDealloc]], label %[[FreeBB:.+]], label %[[Afterwards:.+]]			// CHECK: br i1 %[[NeedDealloc]], label %[[FreeBB:.+]], label %[[Afterwards:.+]]

	// CHECK: [[FreeBB]]:			// CHECK: [[FreeBB]]:
	// CHECK: call void @_ZdlPv(i8* %[[MEM]])			// CHECK: call void @_ZdlPv(i8* %[[MEM]])
	(180 lines not shown)

clang/test/CodeGenCoroutines/coro-gro-nrvo.cpp

(30 lines not shown)
struct coro {		struct coro {
coro(coro const&);		coro(coro const&);
struct Impl;		struct Impl;
Impl *impl;		Impl *impl;
};		};

// Verify that the NRVO is applied to the Gro object.		// Verify that the NRVO is applied to the Gro object.
// CHECK-LABEL: define{{.*}} void @_Z1fi(%struct.coro* noalias sret(%struct.coro) align 8 %agg.result, i32 %0)		// CHECK-LABEL: define{{.*}} void @_Z1fi(%struct.coro* noalias sret(%struct.coro) align 8 %agg.result, i32 %0)
coro f(int) {		coro f(int) {
// CHECK: %call = call noalias nonnull i8* @_Znwm(		// CHECK: %call = call noalias nonnull i8* @_Znwm(
// CHECK-NEXT: br label %[[CoroInit:.*]]		// CHECK-NEXT: br label %[[CoroAlign:.*]]
		// CHECK: [[CoroAlign:.*]]:
		// CHECK-NEXT: call i8* @llvm.coro.overalign(i8* %call,
		// CHECK-NEXT: br label %[[CoroInit:.+]]

// CHECK: {{.*}}[[CoroInit]]:		// CHECK: {{.*}}[[CoroInit]]:
// CHECK: store i1 false, i1* %gro.active		// CHECK: store i1 false, i1* %gro.active
// CHECK: call void @{{.*get_return_objectEv}}(%struct.coro* sret(%struct.coro) align 8 %agg.result		// CHECK: call void @{{.*get_return_objectEv}}(%struct.coro* sret(%struct.coro) align 8 %agg.result
// CHECK-NEXT: store i1 true, i1* %gro.active		// CHECK-NEXT: store i1 true, i1* %gro.active
co_return;		co_return;
}		}


template <class RetObject>		template <class RetObject>
struct promise_type_with_on_alloc_failure {		struct promise_type_with_on_alloc_failure {
static RetObject get_return_object_on_allocation_failure();		static RetObject get_return_object_on_allocation_failure();
RetObject get_return_object();		RetObject get_return_object();
(9 lines not shown)
struct coro_two {		struct coro_two {
struct Impl;		struct Impl;
Impl *impl;		Impl *impl;
};		};

// Verify that the NRVO is applied to the Gro object.		// Verify that the NRVO is applied to the Gro object.
// CHECK-LABEL: define{{.*}} void @_Z1hi(%struct.coro_two* noalias sret(%struct.coro_two) align 8 %agg.result, i32 %0)		// CHECK-LABEL: define{{.*}} void @_Z1hi(%struct.coro_two* noalias sret(%struct.coro_two) align 8 %agg.result, i32 %0)
coro_two h(int) {		coro_two h(int) {

// CHECK: %call = call noalias i8* @_ZnwmRKSt9nothrow_t		// CHECK: %call = call noalias i8* @_ZnwmRKSt9nothrow_t
// CHECK-NEXT: %[[CheckNull:.*]] = icmp ne i8* %call, null		// CHECK-NEXT: %[[CheckNull:.*]] = icmp ne i8* %call, null
// CHECK-NEXT: br i1 %[[CheckNull]], label %[[InitOnSuccess:.*]], label %[[InitOnFailure:.*]]		// CHECK-NEXT: br i1 %[[CheckNull]], label %[[AlignBB:.*]], label %[[InitOnFailure:.*]]

// CHECK: {{.*}}[[InitOnFailure]]:		// CHECK: {{.*}}[[InitOnFailure]]:
// CHECK-NEXT: call void @{{.*get_return_object_on_allocation_failureEv}}(%struct.coro_two* sret(%struct.coro_two) align 8 %agg.result		// CHECK-NEXT: call void @{{.*get_return_object_on_allocation_failureEv}}(%struct.coro_two* sret(%struct.coro_two) align 8 %agg.result
// CHECK-NEXT: br label %[[RetLabel:.*]]		// CHECK-NEXT: br label %[[RetLabel:.*]]

		// CHECK: [[AlignBB]]:
		// CHECK-NEXT: call i8* @llvm.coro.overalign
		// CHECK-NEXT: br label %[[InitOnSuccess:.+]]

// CHECK: {{.*}}[[InitOnSuccess]]:		// CHECK: {{.*}}[[InitOnSuccess]]:
// CHECK: store i1 false, i1* %gro.active		// CHECK: store i1 false, i1* %gro.active
// CHECK: call void @{{.*get_return_objectEv}}(%struct.coro_two* sret(%struct.coro_two) align 8 %agg.result		// CHECK: call void @{{.*get_return_objectEv}}(%struct.coro_two* sret(%struct.coro_two) align 8 %agg.result
// CHECK-NEXT: store i1 true, i1* %gro.active		// CHECK-NEXT: store i1 true, i1* %gro.active

// CHECK: [[RetLabel]]:		// CHECK: [[RetLabel]]:
// CHECK-NEXT: ret void		// CHECK-NEXT: ret void
co_return;		co_return;
}		}

llvm/docs/Coroutines.rst

(928 lines not shown)
::		::

declare i32 @llvm.coro.size.i32()		declare i32 @llvm.coro.size.i32()
declare i64 @llvm.coro.size.i64()		declare i64 @llvm.coro.size.i64()

Overview:		Overview:
"""""""""		"""""""""

The '``llvm.coro.size``' intrinsic returns the number of bytes		The '``llvm.coro.size``' intrinsic returns the number of bytes
required to store a `coroutine frame`_. This is only supported for		required to store a `coroutine state`. This is only supported for
switched-resume coroutines.		switched-resume coroutines.

Arguments:		Arguments:
""""""""""		""""""""""

None		None

Semantics:		Semantics:
""""""""""		""""""""""

The `coro.size` intrinsic is lowered to a constant representing the size of		The `coro.size` intrinsic is lowered to a constant representing the size of
the coroutine frame.		the `coroutine state`. The `coroutine state` may contain coroutine frame and
		padding for alignment if needed.

.. _coro.begin:		.. _coro.begin:

'llvm.coro.begin' Intrinsic		'llvm.coro.begin' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::		::

declare i8* @llvm.coro.begin(token <id>, i8* <mem>)		declare i8* @llvm.coro.begin(token <id>, i8* <mem>)
(42 lines not shown)

Arguments:		Arguments:
""""""""""		""""""""""

The first argument is a token returned by a call to '``llvm.coro.id``'		The first argument is a token returned by a call to '``llvm.coro.id``'
identifying the coroutine.		identifying the coroutine.

The second argument is a pointer to the coroutine frame. This should be the same		The second argument is a pointer to the coroutine frame. This should be the same
pointer that was returned by prior `coro.begin` call.		pointer that was returned by prior `coro.begin` call even when overalign happens.
		The compiler would handle it to make sure this intrinsic returns a pointer to the
		coroutine state.

Example (custom deallocation function):		Example (custom deallocation function):
"""""""""""""""""""""""""""""""""""""""		"""""""""""""""""""""""""""""""""""""""

.. code-block:: llvm		.. code-block:: llvm

cleanup:		cleanup:
%mem = call i8* @llvm.coro.free(token %id, i8* %frame)		%mem = call i8* @llvm.coro.free(token %id, i8* %frame)
(57 lines not shown)
coro.alloc:		coro.alloc:
%frame.size = call i32 @llvm.coro.size()		%frame.size = call i32 @llvm.coro.size()
%alloc = call i8* @MyAlloc(i32 %frame.size)		%alloc = call i8* @MyAlloc(i32 %frame.size)
br label %coro.begin		br label %coro.begin

coro.begin:		coro.begin:
%phi = phi i8* [ null, %entry ], [ %alloc, %coro.alloc ]		%phi = phi i8* [ null, %entry ], [ %alloc, %coro.alloc ]
%frame = call i8* @llvm.coro.begin(token %id, i8* %phi)		%frame = call i8* @llvm.coro.begin(token %id, i8* %phi)

		.. _coro.overalign:

		'llvm.coro.overalign' Intrinsic
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
		::

		declare i8* @llvm.coro.overalign(i8* allocated, i64 max_align)

		Overview:
		"""""""""

		The '``llvm.coro.overalign``' intrinsic returns the address of the coroutine frame,
		which would satisfy the alignment required by `max_align`. This intrinsic is used
		when the alignment requirement of the coroutine frame may exceed the maximum available
		alignment of the corresponding allocator. In this case, the compiler would allocate more
		space to satisfy the alignment requirement of the coroutine frame. This is only supported
		for switched-resume coroutines.

		Arguments:
		""""""""""

		The first argument is a pointer to the allocated space. The second argument is the
		maximum available alignment of the allocator. The second argument must be a constant.

		Semantics:
		""""""""""

		A frontend should emit '``llvm.coro.overalign``' when the max available alignment of
		allocator is limited. Then the frontend should use the return value of this intrinsic
		as the address of coroutine frame.

		Example:
		""""""""

		.. code-block:: llvm

		entry:
		%id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
		%dyn.alloc.required = call i1 @llvm.coro.alloc(token %id)
		br i1 %dyn.alloc.required, label %coro.alloc, label %coro.begin

		coro.alloc:
		%frame.size = call i32 @llvm.coro.size()
		%alloc = call i8* @MyAlloc(i32 %frame.size)
		%aligned = call i8* @llvm.coro.overalign(i8* %alloc, i64 16)
		br label %coro.begin

		coro.begin:
		%phi = phi i8* [ null, %entry ], [ %aligned, %coro.alloc ]
		%frame = call i8* @llvm.coro.begin(token %id, i8* %phi)

.. _coro.noop:		.. _coro.noop:

'llvm.coro.noop' Intrinsic		'llvm.coro.noop' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::		::

declare i8* @llvm.coro.noop()		declare i8* @llvm.coro.noop()

(700 lines not shown)

llvm/include/llvm/IR/Intrinsics.td

(1,227 lines not shown)
def int_coro_id_retcon : Intrinsic<[llvm_token_ty],		def int_coro_id_retcon : Intrinsic<[llvm_token_ty],
[llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty,		[llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty,
llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],		llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
[]>;		[]>;
def int_coro_id_retcon_once : Intrinsic<[llvm_token_ty],		def int_coro_id_retcon_once : Intrinsic<[llvm_token_ty],
[llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty,		[llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty,
llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],		llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
[]>;		[]>;
def int_coro_alloc : Intrinsic<[llvm_i1_ty], [llvm_token_ty], []>;		def int_coro_alloc : Intrinsic<[llvm_i1_ty], [llvm_token_ty], []>;
		def int_coro_overalign: Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i64_ty], []>;
def int_coro_id_async : Intrinsic<[llvm_token_ty],		def int_coro_id_async : Intrinsic<[llvm_token_ty],
[llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty],		[llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty],
[]>;		[]>;
def int_coro_async_context_alloc : Intrinsic<[llvm_ptr_ty],		def int_coro_async_context_alloc : Intrinsic<[llvm_ptr_ty],
[llvm_ptr_ty, llvm_ptr_ty],		[llvm_ptr_ty, llvm_ptr_ty],
[]>;		[]>;
def int_coro_async_context_dealloc : Intrinsic<[],		def int_coro_async_context_dealloc : Intrinsic<[],
[llvm_ptr_ty],		[llvm_ptr_ty],
(540 lines not shown)

llvm/lib/Transforms/Coroutines/CoroFrame.cpp

(308 lines not shown)
struct AllocaInfo {		struct AllocaInfo {
AllocaInst *Alloca;		AllocaInst *Alloca;
DenseMap<Instruction *, llvm::Optional<APInt>> Aliases;		DenseMap<Instruction *, llvm::Optional<APInt>> Aliases;
bool MayWriteBeforeCoroBegin;		bool MayWriteBeforeCoroBegin;
AllocaInfo(AllocaInst *Alloca,		AllocaInfo(AllocaInst *Alloca,
DenseMap<Instruction *, llvm::Optional<APInt>> Aliases,		DenseMap<Instruction *, llvm::Optional<APInt>> Aliases,
bool MayWriteBeforeCoroBegin)		bool MayWriteBeforeCoroBegin)
: Alloca(Alloca), Aliases(std::move(Aliases)),		: Alloca(Alloca), Aliases(std::move(Aliases)),
MayWriteBeforeCoroBegin(MayWriteBeforeCoroBegin) {}		MayWriteBeforeCoroBegin(MayWriteBeforeCoroBegin) {}

		Align getAlign() const { return Alloca->getAlign(); }
};		};
struct FrameDataInfo {		struct FrameDataInfo {
// All the values (that are not allocas) that needs to be spilled to the		// All the values (that are not allocas) that needs to be spilled to the
// frame.		// frame.
SpillInfo Spills;		SpillInfo Spills;
// Allocas contains all values defined as allocas that need to live in the		// Allocas contains all values defined as allocas that need to live in the
// frame.		// frame.
SmallVector<AllocaInfo, 8> Allocas;		SmallVector<AllocaInfo, 8> Allocas;
(753 lines not shown)
SubProgram->replaceOperandWith(		SubProgram->replaceOperandWith(
7, (MDTuple::get(F.getContext(), RetainedNodesVec)));		7, (MDTuple::get(F.getContext(), RetainedNodesVec)));
}		}

DBuilder.insertDeclare(Shape.FramePtr, FrameDIVar,		DBuilder.insertDeclare(Shape.FramePtr, FrameDIVar,
DBuilder.createExpression(), DILoc,		DBuilder.createExpression(), DILoc,
Shape.FramePtr->getNextNode());		Shape.FramePtr->getNextNode());
}		}

		/// Handle the case that the alignment requirement of the coroutine frame
		/// exceeds the max available alignment of the allocator, which specified
		/// in `@llvm.coro.overalign`.
		static AllocaInst *tryOveralignFrame(coro::Shape &Shape, Align FrameAlign) {
		assert(Shape.ABI == coro::ABI::Switch);
		auto *AlignedAlloc = Shape.SwitchLowering.AlignedAlloc;
		// A frontend is allowed not to emit `@llvm.coro.overalign` if it feels
		// unneeded.
		if (!AlignedAlloc)
		return nullptr;

		ConstantInt *MaxAlignment =
		dyn_cast<ConstantInt>(AlignedAlloc->getOperand(1));
		if (!MaxAlignment)
		report_fatal_error(
		"The align value passed to @llvm.coro.overalign is not constant.\n");
		if (!isPowerOf2_64(MaxAlignment->getZExtValue()))
		report_fatal_error(
		"The align value Passed to @llvm.coro.overalign is not power of 2.\n");

		if (FrameAlign.value() <= MaxAlignment->getZExtValue())
		return nullptr;

		Shape.SwitchLowering.OveralignedSpace =
		FrameAlign.value() - MaxAlignment->getZExtValue();

		auto *CB = Shape.CoroBegin;
		LLVMContext &C = CB->getContext();
		IRBuilder<> Builder(C);
		Function &F = *CB->getParent()->getParent();

		Builder.SetInsertPoint(F.getEntryBlock().getFirstNonPHIOrDbgOrLifetime());
		// Alloca to store the allocated address to make sure we could delete it
		// correctly.
		auto *RawFrameAddrAlloca = Builder.CreateAlloca(llvm::Type::getInt8PtrTy(C),
		nullptr, "RawFrameAddr");
		// To calculate the address for the frame to make it align correctly.
		// ```
		// (AllocatedAddr + Align - 1) & (~(Align - 1))
		// ```
		// given Align equals to 2^m, `& (~(Align - 1))` would make the lowest m bit
		// for (AllocatedAddr + Align - 1) to 0. And (AllocatedAddr + Align - 1) must
		// exceed the address that just fit.
		Builder.SetInsertPoint(AlignedAlloc->getNextNode());
		Value *Allocated = AlignedAlloc->getOperand(0);
		auto *AllocatedAddr =
		Builder.CreatePtrToInt(Allocated, llvm::Type::getInt64Ty(C));
		auto *AlignConstant =
		ConstantInt::get(llvm::Type::getInt64Ty(C), FrameAlign.value());
		auto *Mask = Builder.CreateSub(
		AlignConstant, ConstantInt::get(llvm::Type::getInt64Ty(C), 1), "mask");
		auto *Boundary = Builder.CreateAdd(AllocatedAddr, Mask);
		auto *InvertedMask = Builder.CreateNot(Mask);
		auto *FrameAddr = Builder.CreateAnd(Boundary, InvertedMask);
		/// NOTE: Would the int2ptr cast breaks the noalias attribute for the frame?
		auto *Frame = Builder.CreateIntToPtr(FrameAddr, llvm::Type::getInt8PtrTy(C),
		"CoroFrame");
		Builder.CreateStore(Allocated, RawFrameAddrAlloca);

		AlignedAlloc->replaceAllUsesWith(Frame);
		AlignedAlloc->eraseFromParent();
		Shape.SwitchLowering.AlignedAlloc = nullptr;

		SmallVector<CoroFreeInst *, 4> CoroFrees;
		for (User *U : Shape.CoroBegin->getId()->users())
		if (auto CF = dyn_cast<CoroFreeInst>(U))
		Lint: Pre-merge checks: clang-tidy: warning: 'auto CF' can be declared as 'auto *CF' [llvm-qualified-auto]
		CoroFrees.push_back(CF);

		/// Load the originally allocated address so that it is the one freed,
		/// which avoids a memory leak.
		for (CoroFreeInst *Free : CoroFrees) {
		Builder.SetInsertPoint(Free);
		auto *RawFrameAddr = Builder.CreateLoad(Frame->getType(),
		RawFrameAddrAlloca, "RawFrameAddr");
		Free->setOperand(1, RawFrameAddr);
		}

		return cast<AllocaInst>(RawFrameAddrAlloca);
		}

// Build a struct that will keep state for an active coroutine.		// Build a struct that will keep state for an active coroutine.
// struct f.frame {		// struct f.frame {
// ResumeFnTy ResumeFnAddr;		// ResumeFnTy ResumeFnAddr;
// ResumeFnTy DestroyFnAddr;		// ResumeFnTy DestroyFnAddr;
// int ResumeIndex;		// int ResumeIndex;
// ... promise (if present) ...		// ... promise (if present) ...
// ... spills ...		// ... spills ...
// };		// };
(64 lines not shown)
for (auto &S : FrameData.Spills) {		for (auto &S : FrameData.Spills) {
if (const Argument *A = dyn_cast<Argument>(S.first))		if (const Argument *A = dyn_cast<Argument>(S.first))
if (A->hasByValAttr())		if (A->hasByValAttr())
FieldType = A->getParamByValType();		FieldType = A->getParamByValType();
FieldIDType Id =		FieldIDType Id =
B.addField(FieldType, None, false /*header*/, true /*IsSpillOfValue*/);		B.addField(FieldType, None, false /*header*/, true /*IsSpillOfValue*/);
FrameData.setFieldIndex(S.first, Id);		FrameData.setFieldIndex(S.first, Id);
}		}

		if (Shape.ABI == coro::ABI::Switch && !FrameData.Allocas.empty()) {
		Align FrameAlign =
		std::max_element(
		FrameData.Allocas.begin(), FrameData.Allocas.end(),
		[](auto &A1, auto &A2) { return A1.getAlign() < A2.getAlign(); })
		->getAlign();
		AllocaInst *RawFrameAlloca = tryOveralignFrame(Shape, FrameAlign);
		if (RawFrameAlloca) {
		FrameData.Allocas.emplace_back(
		RawFrameAlloca, DenseMap<Instruction *, llvm::Optional<APInt>>{},
		/*MayWriteBeforeCoroBegin=*/true);
		FrameData.setFieldIndex(RawFrameAlloca,
		B.addFieldForAlloca(RawFrameAlloca));
		}
		}

B.finish(FrameTy);		B.finish(FrameTy);
FrameData.updateLayoutIndex(B);		FrameData.updateLayoutIndex(B);
Shape.FrameAlign = B.getStructAlign();		Shape.FrameAlign = B.getStructAlign();
Shape.FrameSize = B.getStructSize();		Shape.FrameSize = B.getStructSize();

switch (Shape.ABI) {		switch (Shape.ABI) {
case coro::ABI::Switch: {		case coro::ABI::Switch: {
// In the switch ABI, remember the switch-index field.		// In the switch ABI, remember the switch-index field.
auto IndexField = B.getLayoutField(*SwitchIndexFieldId);		auto IndexField = B.getLayoutField(*SwitchIndexFieldId);
Shape.SwitchLowering.IndexField = IndexField.LayoutFieldIndex;		Shape.SwitchLowering.IndexField = IndexField.LayoutFieldIndex;
Shape.SwitchLowering.IndexAlign = IndexField.Alignment.value();		Shape.SwitchLowering.IndexAlign = IndexField.Alignment.value();
Shape.SwitchLowering.IndexOffset = IndexField.Offset;		Shape.SwitchLowering.IndexOffset = IndexField.Offset;

// Also round the frame size up to a multiple of its alignment, as is		// Also round the frame size up to a multiple of its alignment, as is
// generally expected in C/C++.		// generally expected in C/C++.
Shape.FrameSize = alignTo(Shape.FrameSize, Shape.FrameAlign);		Shape.FrameSize = alignTo(Shape.FrameSize, Shape.FrameAlign);

		if (Shape.SwitchLowering.AlignedAlloc) {
		Shape.SwitchLowering.AlignedAlloc->replaceAllUsesWith(
		Shape.SwitchLowering.AlignedAlloc->getOperand(0));
		Shape.SwitchLowering.AlignedAlloc->eraseFromParent();
		Shape.SwitchLowering.AlignedAlloc = nullptr;
		}
break;		break;
}		}

// In the retcon ABI, remember whether the frame is inline in the storage.		// In the retcon ABI, remember whether the frame is inline in the storage.
case coro::ABI::Retcon:		case coro::ABI::Retcon:
case coro::ABI::RetconOnce: {		case coro::ABI::RetconOnce: {
auto Id = Shape.getRetconCoroId();		auto Id = Shape.getRetconCoroId();
Shape.RetconLowering.IsFrameInlineInStorage		Shape.RetconLowering.IsFrameInlineInStorage

llvm/lib/Transforms/Coroutines/CoroInstr.h

public:
static bool classof(const IntrinsicInst *I) {		static bool classof(const IntrinsicInst *I) {
return I->getIntrinsicID() == Intrinsic::coro_alloc;		return I->getIntrinsicID() == Intrinsic::coro_alloc;
}		}
static bool classof(const Value *V) {		static bool classof(const Value *V) {
return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
}		}
};		};

		/// This represents the llvm.coro.overalign instruction.
		class LLVM_LIBRARY_VISIBILITY CoroOveralignInst : public IntrinsicInst {
		public:
		// Methods to support type inquiry through isa, cast, and dyn_cast:
		static bool classof(const IntrinsicInst *I) {
		return I->getIntrinsicID() == Intrinsic::coro_overalign;
		}
		static bool classof(const Value *V) {
		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
		}
		};

/// This represents a common base class for llvm.coro.id instructions.		/// This represents a common base class for llvm.coro.id instructions.
class LLVM_LIBRARY_VISIBILITY AnyCoroIdInst : public IntrinsicInst {		class LLVM_LIBRARY_VISIBILITY AnyCoroIdInst : public IntrinsicInst {
public:		public:
CoroAllocInst *getCoroAlloc() {		CoroAllocInst *getCoroAlloc() {
for (User *U : users())		for (User *U : users())
if (auto *CA = dyn_cast<CoroAllocInst>(U))		if (auto *CA = dyn_cast<CoroAllocInst>(U))
return CA;		return CA;
return nullptr;		return nullptr;

llvm/lib/Transforms/Coroutines/CoroInternal.h

struct LLVM_LIBRARY_VISIBILITY Shape {

/// This would only be true if optimization are enabled.		/// This would only be true if optimization are enabled.
bool ReuseFrameSlot;		bool ReuseFrameSlot;

struct SwitchLoweringStorage {		struct SwitchLoweringStorage {
SwitchInst *ResumeSwitch;		SwitchInst *ResumeSwitch;
AllocaInst *PromiseAlloca;		AllocaInst *PromiseAlloca;
BasicBlock *ResumeEntryBlock;		BasicBlock *ResumeEntryBlock;
		CoroOveralignInst *AlignedAlloc;
unsigned IndexField;		unsigned IndexField;
unsigned IndexAlign;		unsigned IndexAlign;
unsigned IndexOffset;		unsigned IndexOffset;
		unsigned OveralignedSpace;
bool HasFinalSuspend;		bool HasFinalSuspend;
};		};

struct RetconLoweringStorage {		struct RetconLoweringStorage {
Function *ResumePrototype;		Function *ResumePrototype;
Function *Alloc;		Function *Alloc;
Function *Dealloc;		Function *Dealloc;
BasicBlock *ReturnBlock;		BasicBlock *ReturnBlock;

llvm/lib/Transforms/Coroutines/CoroSplit.cpp

static void replaceFrameSize(coro::Shape &Shape) {
if (Shape.CoroSizes.empty())		if (Shape.CoroSizes.empty())
return;		return;

// In the same function all coro.sizes should have the same result type.		// In the same function all coro.sizes should have the same result type.
auto *SizeIntrin = Shape.CoroSizes.back();		auto *SizeIntrin = Shape.CoroSizes.back();
Module *M = SizeIntrin->getModule();		Module *M = SizeIntrin->getModule();
const DataLayout &DL = M->getDataLayout();		const DataLayout &DL = M->getDataLayout();
auto Size = DL.getTypeAllocSize(Shape.FrameTy);		auto Size = DL.getTypeAllocSize(Shape.FrameTy);
auto *SizeConstant = ConstantInt::get(SizeIntrin->getType(), Size);		// To make sure the allocated space is enough for coroutine frame to satisfy
		// the align requirement. See tryOveralignFrame in CoroFrame.cpp for details.
		auto *SizeConstant = ConstantInt::get(
		SizeIntrin->getType(), Size + (Shape.ABI == coro::ABI::Switch
		? Shape.SwitchLowering.OveralignedSpace
		: 0));

for (CoroSizeInst *CS : Shape.CoroSizes) {		for (CoroSizeInst *CS : Shape.CoroSizes) {
CS->replaceAllUsesWith(SizeConstant);		CS->replaceAllUsesWith(SizeConstant);
CS->eraseFromParent();		CS->eraseFromParent();
}		}
}		}
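
The extra bytes added to the `llvm.coro.size` constant follow the rule stated in the summary: request `frame_align - new_align` additional bytes when the frame needs more alignment than the allocator guarantees. A minimal standalone sketch of that arithmetic is given here; `paddedAllocSize` and its parameter names are hypothetical and do not appear in the patch, and both alignments are assumed to be powers of two.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the patch): the number of bytes to request
// from an allocator that only guarantees AllocAlign, so that a frame of
// FrameSize bytes aligned to FrameAlign always fits inside the block.
// Assumption: FrameAlign and AllocAlign are powers of two.
constexpr std::size_t paddedAllocSize(std::size_t FrameSize,
                                      std::size_t FrameAlign,
                                      std::size_t AllocAlign) {
  // The block already starts on a multiple of AllocAlign, so the worst-case
  // distance to the next FrameAlign boundary is FrameAlign - AllocAlign.
  return FrameAlign > AllocAlign ? FrameSize + (FrameAlign - AllocAlign)
                                 : FrameSize;
}
```

When the allocator's alignment already covers the frame, the padding is zero and the size is unchanged, which matches the summary's statement that nothing changes for users who do not need over-alignment.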

// Create a global constant array containing pointers to functions provided and		// Create a global constant array containing pointers to functions provided and

llvm/lib/Transforms/Coroutines/Coroutines.cpp

void coro::replaceCoroFree(CoroIdInst *CoroId, bool Elide) {
SmallVector<CoroFreeInst *, 4> CoroFrees;		SmallVector<CoroFreeInst *, 4> CoroFrees;
for (User *U : CoroId->users())		for (User *U : CoroId->users())
if (auto CF = dyn_cast<CoroFreeInst>(U))		if (auto CF = dyn_cast<CoroFreeInst>(U))
CoroFrees.push_back(CF);		CoroFrees.push_back(CF);

if (CoroFrees.empty())		if (CoroFrees.empty())
return;		return;

Value *Replacement =
Elide ? ConstantPointerNull::get(Type::getInt8PtrTy(CoroId->getContext()))
: CoroFrees.front()->getFrame();

for (CoroFreeInst *CF : CoroFrees) {		for (CoroFreeInst *CF : CoroFrees) {
CF->replaceAllUsesWith(Replacement);		CF->replaceAllUsesWith(Elide ? ConstantPointerNull::get(
		Type::getInt8PtrTy(CoroId->getContext()))
		: CF->getFrame());
CF->eraseFromParent();		CF->eraseFromParent();
}		}
}		}

// FIXME: This code is stolen from CallGraph::addToCallGraph(Function *F), which		// FIXME: This code is stolen from CallGraph::addToCallGraph(Function *F), which
// happens to be private. It is better for this functionality exposed by the		// happens to be private. It is better for this functionality exposed by the
// CallGraph.		// CallGraph.
static void buildCGN(CallGraph &CG, CallGraphNode *Node) {		static void buildCGN(CallGraph &CG, CallGraphNode *Node) {

// Collect "interesting" coroutine intrinsics.		// Collect "interesting" coroutine intrinsics.
void coro::Shape::buildFrom(Function &F) {		void coro::Shape::buildFrom(Function &F) {
bool HasFinalSuspend = false;		bool HasFinalSuspend = false;
size_t FinalSuspendIndex = 0;		size_t FinalSuspendIndex = 0;
clear(*this);		clear(*this);
SmallVector<CoroFrameInst *, 8> CoroFrames;		SmallVector<CoroFrameInst *, 8> CoroFrames;
SmallVector<CoroSaveInst *, 2> UnusedCoroSaves;		SmallVector<CoroSaveInst *, 2> UnusedCoroSaves;
		CoroOveralignInst *AlignedAlloc = nullptr;

for (Instruction &I : instructions(F)) {		for (Instruction &I : instructions(F)) {
if (auto II = dyn_cast<IntrinsicInst>(&I)) {		if (auto II = dyn_cast<IntrinsicInst>(&I)) {
switch (II->getIntrinsicID()) {		switch (II->getIntrinsicID()) {
default:		default:
continue;		continue;
case Intrinsic::coro_size:		case Intrinsic::coro_size:
CoroSizes.push_back(cast<CoroSizeInst>(II));		CoroSizes.push_back(cast<CoroSizeInst>(II));
break;		break;
case Intrinsic::coro_frame:		case Intrinsic::coro_frame:
CoroFrames.push_back(cast<CoroFrameInst>(II));		CoroFrames.push_back(cast<CoroFrameInst>(II));
break;		break;
		case Intrinsic::coro_overalign:
		AlignedAlloc = cast<CoroOveralignInst>(II);
		break;
case Intrinsic::coro_save:		case Intrinsic::coro_save:
// After optimizations, coro_suspends using this coro_save might have		// After optimizations, coro_suspends using this coro_save might have
// been removed, remember orphaned coro_saves to remove them later.		// been removed, remember orphaned coro_saves to remove them later.
if (II->use_empty())		if (II->use_empty())
UnusedCoroSaves.push_back(cast<CoroSaveInst>(II));		UnusedCoroSaves.push_back(cast<CoroSaveInst>(II));
break;		break;
case Intrinsic::coro_suspend_async: {		case Intrinsic::coro_suspend_async: {
auto *Suspend = cast<CoroSuspendAsyncInst>(II);		auto *Suspend = cast<CoroSuspendAsyncInst>(II);
void coro::Shape::buildFrom(Function &F) {
switch (auto IdIntrinsic = Id->getIntrinsicID()) {		switch (auto IdIntrinsic = Id->getIntrinsicID()) {
case Intrinsic::coro_id: {		case Intrinsic::coro_id: {
auto SwitchId = cast<CoroIdInst>(Id);		auto SwitchId = cast<CoroIdInst>(Id);
this->ABI = coro::ABI::Switch;		this->ABI = coro::ABI::Switch;
this->SwitchLowering.HasFinalSuspend = HasFinalSuspend;		this->SwitchLowering.HasFinalSuspend = HasFinalSuspend;
this->SwitchLowering.ResumeSwitch = nullptr;		this->SwitchLowering.ResumeSwitch = nullptr;
this->SwitchLowering.PromiseAlloca = SwitchId->getPromise();		this->SwitchLowering.PromiseAlloca = SwitchId->getPromise();
this->SwitchLowering.ResumeEntryBlock = nullptr;		this->SwitchLowering.ResumeEntryBlock = nullptr;
		this->SwitchLowering.AlignedAlloc = AlignedAlloc;
		this->SwitchLowering.OveralignedSpace = 0;

for (auto AnySuspend : CoroSuspends) {		for (auto AnySuspend : CoroSuspends) {
auto Suspend = dyn_cast<CoroSuspendInst>(AnySuspend);		auto Suspend = dyn_cast<CoroSuspendInst>(AnySuspend);
if (!Suspend) {		if (!Suspend) {
#ifndef NDEBUG		#ifndef NDEBUG
AnySuspend->dump();		AnySuspend->dump();
#endif		#endif
report_fatal_error("coro.id must be paired with coro.suspend");		report_fatal_error("coro.id must be paired with coro.suspend");

llvm/test/Transforms/Coroutines/coro-alloca-overalign.ll

This file was added.

				; Tests that the frame could overalign correctly if the input contains coro.overalign
				; RUN: opt < %s -passes='cgscc(coro-split)' -S | FileCheck %s

				define i8* @f(i1 %n) "coroutine.presplit"="1" {
				entry:
				%x = alloca i64, align 64
				%y = alloca i64, align 64
				%id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
				%size = call i32 @llvm.coro.size.i32()
				%alloc = call i8* @malloc(i32 %size)
				%aligned_alloc = call i8* @llvm.coro.overalign(i8* %alloc, i64 16)
				%hdl = call i8* @llvm.coro.begin(token %id, i8* %aligned_alloc)
				br i1 %n, label %flag_true, label %flag_false

				flag_true:
				%x.alias = bitcast i64* %x to i32*
				br label %merge

				flag_false:
				%y.alias = bitcast i64* %y to i32*
				br label %merge

				merge:
				%alias_phi = phi i32* [ %x.alias, %flag_true ], [ %y.alias, %flag_false ]
				%sp1 = call i8 @llvm.coro.suspend(token none, i1 false)
				switch i8 %sp1, label %suspend [i8 0, label %resume
				i8 1, label %cleanup]
				resume:
				call void @print(i32* %alias_phi)
				br label %cleanup

				cleanup:
				%mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
				call void @free(i8* %mem)
				br label %suspend

				suspend:
				call i1 @llvm.coro.end(i8* %hdl, i1 0)
				ret i8* %hdl
				}

				; CHECK: define i8* @f(i1 %n)
				; CHECK: entry:
				; CHECK: %[[MEM:.+]] = call i8* @malloc(i32 184)
				; CHECK-NEXT: %[[ADDR:.+]] = ptrtoint i8* %[[MEM]] to i64
				; CHECK-NEXT: %[[MASK:.+]] = add i64 %[[ADDR]], 63
				; CHECK-NEXT: %[[FRAME_ADDR:.+]] = and i64 %[[MASK]], -64
				; CHECK-NEXT: %[[FRAME:.+]] = inttoptr i64 %[[FRAME_ADDR]] to i8*
				; CHECK-NEXT: store i8* %[[MEM]], i8** %RawFrameAddr, align 8
				; CHECK-NEXT: %hdl = call noalias nonnull i8* @llvm.coro.begin(token %id, i8* %[[FRAME]])
				;
				; CHECK: cleanup:
				; CHECK: %[[RawFrameAddr:.+]] = load i8*, i8** %RawFrameAddr.reload.addr, align 8
				; CHECK: %mem = call i8* @llvm.coro.free(token %id, i8* %[[RawFrameAddr]])

				declare i8* @llvm.coro.free(token, i8*)
				declare i32 @llvm.coro.size.i32()
				declare i8 @llvm.coro.suspend(token, i1)
				declare void @llvm.coro.resume(i8*)
				declare void @llvm.coro.destroy(i8*)

				declare token @llvm.coro.id(i32, i8*, i8*, i8*)
				declare i1 @llvm.coro.alloc(token)
				declare i8* @llvm.coro.overalign(i8*, i64)
				declare i8* @llvm.coro.begin(token, i8*)
				declare i1 @llvm.coro.end(i8*, i1)

				declare void @print(i32*)
				declare noalias i8* @malloc(i32)
				declare void @free(i8*)
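
The CHECK lines in this test expect three things: the raw `malloc` result is rounded up with an `add 63` / `and -64` pair, the raw pointer is stored in the frame, and the cleanup path reloads that raw pointer before freeing. The same pattern can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `alignUp`, `OveralignedBlock`, `allocateOveraligned`, and `freeOveraligned` are hypothetical names that do not appear in the patch, and `malloc` is assumed to return memory aligned to at least `AllocAlign`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Round Addr up to the next multiple of Align (a power of two). For
// Align == 64 this is the same arithmetic as `add 63` followed by `and -64`.
constexpr std::uintptr_t alignUp(std::uintptr_t Addr, std::uintptr_t Align) {
  return (Addr + (Align - 1)) & ~(Align - 1);
}

// Hypothetical stand-in for the two pointers involved: the address that the
// allocator returned and the aligned address where the frame would start.
struct OveralignedBlock {
  void *Raw;     // must be kept so the block can be freed later
  void *Aligned; // start of the over-aligned region
};

// Assumption: malloc returns memory aligned to at least AllocAlign.
inline OveralignedBlock allocateOveraligned(std::size_t Size,
                                            std::size_t FrameAlign,
                                            std::size_t AllocAlign) {
  std::size_t Padding = FrameAlign > AllocAlign ? FrameAlign - AllocAlign : 0;
  void *Raw = std::malloc(Size + Padding);
  if (!Raw)
    return {nullptr, nullptr};
  auto Addr = reinterpret_cast<std::uintptr_t>(Raw);
  return {Raw, reinterpret_cast<void *>(alignUp(Addr, FrameAlign))};
}

// Free through the raw pointer, never the aligned one.
inline void freeOveraligned(OveralignedBlock B) { std::free(B.Raw); }
```

Keeping the raw pointer next to the aligned one plays the role of the extra frame field described in the summary: without it, the deallocation path would have no way to recover the address the allocator originally returned.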