This is an archive of the discontinued LLVM Phabricator instance.

Differential D100415

[Coroutines] Split coroutine during CoroEarly into an init and ramp function
Changes Planned · Public

Authored by lxfind on Apr 13 2021, 3:26 PM.

Details

Reviewers

rjmccall
junparser
ChuanqiXu
bruno
wenlei

Summary

(I haven't updated tests yet)

A coroutine has the following structure in LLVM IR:

entry:
  alloca ..
  %promise = alloca ...
  %0 = call token @llvm.coro.id(..., %promise)
  %1 = call i1 @llvm.coro.alloc(token %0)
  br i1 %1, label %coro.alloc, label %coro.init

coro.alloc:
  %2 = call i64 @llvm.coro.size.i64()
  %call = call noalias nonnull i8* @_Znwm(i64 %2)
  br label %coro.init

coro.init:                                       ; preds = %coro.alloc, %entry
  %3 = phi i8* [ null, %entry ], [ %call, %coro.alloc ]
  %4 = call i8* @llvm.coro.begin(token %0, i8* %3)
  ...
  move parameters to stack alloca
  create promise object
  ...
  actual coroutine body
  ...

It uses coro.id to uniquely identify the coroutine (coro.id also refers to the promise object), coro.alloc to decide whether to create the frame on the heap, and coro.begin to mark the frame object.
After that, it always moves all parameters to the stack (stored in allocas) and creates the promise object by calling its constructor.
Finally, it emits the actual coroutine body code.

Having all of this in the same function creates problems: optimization passes blend the initialization code with the coroutine body and move code around, which later violates some of the requirements of coroutines.
There are two examples:

Frame objects accessed before coro.begin: coro.begin returns the frame pointer; that is, the frame is only ready after coro.begin. If a value is used across a coroutine suspension and needs to be put on the frame, it must be accessed through the frame instead of through its alloca. This is easy if the value is first accessed after coro.begin: we can just replace all references to it with a pointer into the frame. However, if a value is accessed before coro.begin but also needs to live on the frame, we are in trouble. D66230 made an initial attempt to fix this, but it wasn't complete. I made the fix more robust in D86859, which introduced a lot of complexity to AllocaUseVisitor. The basic idea is that we track every alloca use (both explicit and implicit through aliases) before coro.begin, and if an alloca is touched we copy it into the frame after coro.begin. This is, however, not bulletproof. If there are complicated phi nodes, we may end up having to copy every single alloca to the frame. This patch separates the code before coro.begin from the code after coro.begin, so optimization passes can no longer mix the two. There can be no complicated access to the frame before frame creation.
Captured by-val parameter through MemCpyOptPass: https://bugs.llvm.org/show_bug.cgi?id=48857. To summarize the problem: in the coroutine IR, a first memcpy copies a passed-by-value parameter to a local alloca, and later (after a coroutine suspension) a second memcpy copies that local alloca to another local alloca. MemCpyOptPass merges them and turns the second copy into a copy directly from the parameter to the second local alloca. This leads to a crash because the passed-by-value parameter pointer has died by the time the coroutine resumes after the suspension. This patch separates the parameter copy code from the coroutine body, making this kind of optimization impossible.
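
The second problem can be modeled outside LLVM. The following is a minimal C++ sketch, not code from this patch: `Slot`, `copy`, `asEmitted`, and `afterForwarding` are hypothetical names, and the `alive` flag only stands in for the lifetime of the by-value parameter storage, which is assumed to end at the suspension point.

```cpp
#include <cassert>

// Hypothetical toy storage; `alive` models whether it may still be read.
struct Slot {
  int value;
  bool alive;
};

// Copies src into dst. Returns false when src is dead, which models the
// invalid read that crashes in the real bug.
bool copy(const Slot &src, Slot &dst) {
  if (!src.alive)
    return false;
  dst.value = src.value;
  dst.alive = true;
  return true;
}

// As emitted by the frontend: param -> a, suspend, a -> b.
bool asEmitted(Slot param) {
  assert(param.alive && "the caller passes live storage");
  Slot a{0, false}, b{0, false};
  bool first = copy(param, a);
  param.alive = false; // suspension: the by-value parameter storage dies
  bool second = copy(a, b);
  return first && second;
}

// After the two copies are merged: the second copy reads param directly.
bool afterForwarding(Slot param) {
  assert(param.alive && "the caller passes live storage");
  Slot a{0, false}, b{0, false};
  bool first = copy(param, a);
  param.alive = false; // suspension: the by-value parameter storage dies
  bool second = copy(param, b); // reads dead storage
  return first && second;
}
```

In this model, `asEmitted` succeeds for a live parameter while `afterForwarding` fails, because the forwarded copy reads storage that died at the suspension.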

Overall, we want to split the coroutine as much as possible, as early as possible, to prevent optimization passes from violating coroutine properties.

To split the coroutine early, this patch splits the coroutine right after the parameter move, during the CoroEarly pass. Everything before that point remains in the original function (called the init function), and the rest is put into a new function (called the ramp function). This is done in three steps:

In CGCoroutine, we need to emit a few new intrinsic calls that CoroEarly can use to split the function correctly. First, the parameter move should happen only once, in the init function. To achieve this, a new intrinsic coro.init() is created that returns a boolean value: true in the init function and false in the ramp function. This lets us control the behavioral difference between init and ramp. Second, we need a marker that tells the CoroEarly pass that the init part is done and that the rest belongs to the ramp function. This is achieved by a new intrinsic, coro.init.end(), which marks the splitting point for CoroEarly. Finally, every alloca that stores a parameter copy is annotated with metadata indicating that it holds a parameter and will be used in the ramp function. The same is done for the promise object. These should be the only allocas that need to be used across the init and ramp functions. The metadata allows us to tag these values properly during the CoroEarly pass, so that they won't be removed by DCE even though they may not be used in the rest of the init function.

In the CoroEarly pass, if the coroutine is a switch-lowering coroutine (i.e. has a coro.id), we split it. The split works like this. First, we go through every alloca that has the metadata and generate a call to a new intrinsic, coro.frame.get(), which returns the pointer into the frame for that specific alloca. It captures the alloca, indicates whether it is the promise, and carries a unique ID. All uses of the alloca are then replaced with the value returned by coro.frame.get(). Next, the function is cloned into a new function (the ramp function), which takes only one parameter, the coroutine frame. We then go through all intrinsics in the new function, replacing coro.begin with the parameter, coro.init() with false, and coro.alloc() with false (only the init function needs to allocate the frame). Finally, we go through all intrinsics in the init function, replace coro.init() with true, and generate a call to the ramp function at the location of coro.init.end(), removing the rest of the function.

In the CoroSplit pass, the idea is to inline the ramp function back into the init function so that we can reuse the existing CoroSplit logic. To do so, we introduce a few more CORO_PRESPLIT_ATTR values to tag the different states of the init and ramp functions. When the ramp function is ready to split and we are processing the init function, we inline the ramp function into the init function. To do so, we replace every coro.frame.get() with the original alloca and set the promise field of coro.id to the promise object. After inlining, we update the CGSCC and delete the ramp function.
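
The split in CoroEarly and the later re-inlining in CoroSplit can be illustrated on a toy instruction list. This is only a sketch under simplifying assumptions, not the patch's implementation: `Guard`, `Inst`, `splitEarly`, `inlineRamp`, and `exampleCoroutine` are hypothetical names, control flow is reduced to a per-instruction guard, and coro.frame.get() is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy IR, not LLVM's: each instruction is a string plus the
// intrinsic (if any) whose result guards its execution.
enum class Guard { Always, IfCoroAlloc, IfCoroInit };

struct Inst {
  std::string text;
  Guard guard;
};

using Body = std::vector<Inst>;

struct SplitResult {
  std::vector<std::string> init; // stays in the original function
  std::vector<std::string> ramp; // clone that takes the frame as its argument
};

// Sketch of the split: coro.init() folds to true in init and to false in the
// ramp clone, coro.alloc() folds to false in the ramp, and init is cut at
// coro.init.end().
SplitResult splitEarly(const Body &f) {
  SplitResult r;
  bool initDone = false;
  for (const Inst &i : f) {
    // Init function: everything up to coro.init.end survives.
    if (!initDone) {
      if (i.text == "coro.init.end") {
        r.init.push_back("call ramp(frame)");
        r.init.push_back("ret");
        initDone = true;
      } else if (i.guard == Guard::IfCoroAlloc) {
        r.init.push_back("if coro.alloc: " + i.text);
      } else {
        r.init.push_back(i.text);
      }
    }
    // Ramp function: guarded code is dead because both intrinsics are false.
    if (i.guard != Guard::Always)
      continue;
    r.ramp.push_back(i.text == "frame = coro.begin" ? "frame = arg0" : i.text);
  }
  assert(initDone && "expected a coro.init.end marker");
  return r;
}

// Sketch of the CoroSplit step: paste the ramp body back at the call site.
std::vector<std::string> inlineRamp(const SplitResult &s) {
  std::vector<std::string> out;
  for (const std::string &line : s.init) {
    if (line == "call ramp(frame)") {
      // The ramp's frame argument is the frame already live in init.
      for (const std::string &r : s.ramp)
        if (r != "frame = arg0")
          out.push_back(r);
      break; // the "ret" that followed the call is dropped
    }
    out.push_back(line);
  }
  return out;
}

// Hypothetical input mirroring the structure described in the summary.
Body exampleCoroutine() {
  return {
      Inst{"mem = new(coro.size)", Guard::IfCoroAlloc},
      Inst{"frame = coro.begin", Guard::Always},
      Inst{"copy parameters", Guard::IfCoroInit},
      Inst{"construct promise", Guard::IfCoroInit},
      Inst{"coro.init.end", Guard::IfCoroInit},
      Inst{"coroutine body", Guard::Always},
  };
}
```

Calling `splitEarly(exampleCoroutine())` leaves allocation, coro.begin, the parameter copies, and the promise construction in `init`, followed by the call to the ramp, while `ramp` contains only the frame argument and the coroutine body. `inlineRamp` then yields a single instruction list again, which is the shape the existing CoroSplit logic expects.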

Examples:

Coroutine IR emitted from the Clang front-end will look like this:

define @foo(i32 %val...)
entry:
  %val1 = alloca i32, align 4, !coroutine_frame_alloca !2
  ...
  %promise = alloca... !coroutine_frame_alloca !5
  %0 = call token @llvm.coro.id(..., null...
  %1 = call i1 @llvm.coro.alloc(token %0)
  br i1 %1, label %coro.alloc, label %coro.begin

coro.alloc:
  %2 = call i64 @llvm.coro.size.i64()
  %call = call noalias nonnull i8* @_Znwm(i64 %2)
  br label %coro.begin

coro.begin:                                       ; preds = %coro.alloc, %entry
  %3 = phi i8* [ null, %entry ], [ %call, %coro.alloc ]
  %4 = call i8* @llvm.coro.begin(token %0, i8* %3)
  %5 = call i1 @llvm.coro.init()
  br i1 %5, label %coro.init, label %coro.init.ready

coro.init:                                        ; preds = %coro.begin
  %6 = bitcast i32* %val1 to i8*
  %7 = load i32, i32* %val.addr, align 4
  store i32 %7, i32* %val1, align 4
  ...
  call void @llvm.coro.init.end()
  br label %coro.init.ready

coro.init.ready:
  ...

...
!2 = !{i1 false, i32 0}
!5 = !{i1 true, i32 3}
...

CoroEarly will then split it into two functions:

define @foo(i32 %val...)
entry:
  %val1 = alloca i32, align 4
  ...
  %promise = alloca...
  %0 = call token @llvm.coro.id(..., null...
  %1 = call i1 @llvm.coro.alloc(token %0)
  br i1 %1, label %coro.alloc, label %coro.begin

coro.alloc:
  %2 = call i64 @llvm.coro.size.i64()
  %call = call noalias nonnull i8* @_Znwm(i64 %2)
  br label %coro.begin

coro.begin:                                       ; preds = %coro.alloc, %entry
  %3 = phi i8* [ null, %entry ], [ %call, %coro.alloc ]
  %4 = call i8* @llvm.coro.begin(token %0, i8* %3)
  %5 = bitcast i32* %val1 to i8*
  %6 = call i8* @llvm.coro.frame.get(i8* %4, i8* %5, i1 false, i32 0)
  %7 = bitcast i8* %6 to i32*
  ...
  br label %coro.init

coro.init:                                        ; preds = %coro.begin
  %17 = bitcast i32* %7 to i8*
  %18 = load i32, i32* %val.addr, align 4
  store i32 %18, i32* %7, align 4
  ...
  call void @foo.ramp(i8* %4)
  ret void

define @foo.ramp(i8* %0)
entry:
  %val1 = alloca i32, align 4
  ...
  %promise = alloca...
  %1 = call token @llvm.coro.id(..., null...
  br label %coro.begin

coro.begin:
  %2 = bitcast i32* %val1 to i8*
  %3 = call i8* @llvm.coro.frame.get(i8* %0, i8* %2, i1 false, i32 0)
  %4 = bitcast i8* %3 to i32*
  br label %coro.init.ready

coro.init.ready:
  ...

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	440 ms	x64 debian > Clang.CodeGenCXX::ubsan-coroutines.cpp
	40 ms	x64 debian > Clang.CodeGenCoroutines::coro-always-inline.cpp
	60 ms	x64 debian > Clang.CodeGenCoroutines::coro-await-resume-eh.cpp
	50 ms	x64 debian > Clang.CodeGenCoroutines::coro-newpm-pipeline.cpp
	70 ms	x64 debian > Clang.CodeGenCoroutines::coro-params.cpp
		37 tests failed in total.

Event Timeline

lxfind created this revision. Apr 13 2021, 3:26 PM

Herald added subscribers: ChuanqiXu, hoy, modimo and 2 others. Apr 13 2021, 3:26 PM

lxfind requested review of this revision. Apr 13 2021, 3:26 PM

Herald added projects: Restricted Project, Restricted Project. Apr 13 2021, 3:26 PM

Herald added subscribers: llvm-commits, cfe-commits, jdoerfert.

lxfind added reviewers: rjmccall, junparser, ChuanqiXu, bruno, wenlei. Apr 13 2021, 3:31 PM

lxfind edited the summary of this revision.

some cleanups

The overall idea looks good to me. Since this is a fundamental and large change, I need time to actually run it and look into the details.
Since this patch introduces new intrinsics and the new concept of an init function, they may need to be added to the coroutines documentation.

Harbormaster completed remote builds in B98566: Diff 337270. Apr 13 2021, 8:28 PM

Harbormaster completed remote builds in B98581: Diff 337290. Apr 13 2021, 9:57 PM

It looks like this code may trigger the assertion in CoroInstr.h for CoroIdInst::setCoroutineSelf.

Also, since this patch would enlarge the coroutine frame, it may naturally affect performance. I believe it wouldn't really matter. I just find that we need coroutine benchmarks, which seem to be missing now. They are really needed when we talk about coroutine performance. Although our intuition tells us it should be OK, we still need data to back us up rather than guessing. I am not asking for a benchmark or score for this patch. I just find that there is no benchmark with which we can evaluate coroutine performance. I believe that would be a future direction.

Then, since there is a hidden chance of decreasing performance, I wonder whether it would be better to provide an option to control how the coroutine handles the parameters. We could use the strategy in this patch by default. An option could help us tune performance in the future.

I will try to look into the details of the code.

clang/lib/CodeGen/CGCoroutine.cpp
619	Is this vector used anywhere?
638	I wonder whether it would be better to document the `coroutine_frame_alloca` metadata somewhere, as is done for the `tbaa` metadata.
646	It calls `coro.init.end` without a preceding call to `coro.init`, which looks odd.
llvm/lib/Transforms/Coroutines/CoroEarly.cpp
153	We need a comment explaining the intent of this function.
156	The several `{}` scopes in the function, used to avoid name collisions, look odd.
172	`VoidPt` should be `VoidPtr`.
176	We need to document the semantics of `coro.frame.get`.
193	I noticed that this patch deletes `F.addFnAttr(CORO_PRESPLIT_ATTR, UNPREPARED_FOR_SPLIT);` below. Does it conflict with D100282? I want to know whether we still need to add the `NoInline` attribute once D100282 is checked in.
218	Why do we need to replace `coro.alloc` with 0 now? Replacing `coro.alloc` with 0 implies that we should allocate the frame on the stack. I don't think we can know at this point how the frame should be allocated.
333	Should we give `splitRampFunction` another name? It may be surprising to see `split` in the CoroEarly pass instead of the CoroSplit pass. BTW, what do you think about creating the ramp function during CodeGen in the frontend?

@ChuanqiXu Thank you for the detailed review! Really appreciate it.
I agree we should create a coroutine benchmark at some point, ideally a realistic benchmark driven by production code. We can work on that in the future. For this patch, it's probably not worth hiding it behind an option, for three reasons: 1) it would be extremely complicated, 2) most parameters would end up on the frame anyway, and 3) this patch doesn't actually force parameters to be put on the frame. Before frame creation, all the parameters are put back into allocas, so the current alloca analysis and optimization still apply to them. Some parameters may therefore end up not being put on the frame, and I wouldn't expect this to increase frame size in most cases.

I will add documentation later, once we all agree on the high-level idea/direction of this patch.

clang/lib/CodeGen/CGCoroutine.cpp
646	This path is already conditionally guarded by `coro.init`.
llvm/lib/Transforms/Coroutines/CoroEarly.cpp
193	Good question. For now they are somewhat redundant. We probably don't need to add NoInline here.
218	This is replacing it in the NewF (the cloned new ramp function). We only need to allocate the frame once, which will be done in the init function. So in the ramp function we can always skip it.
333	I thought about doing it in CodeGen. But it's really complicated to split functions in CodeGen.

Thanks for the disclaimer. Although I am not familiar with many details in this patch, the high-level idea looks good to me.

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
2231	I am not familiar with LLVM's policy on how legacy passes should be treated in trunk. I mean, are we responsible for updating the LegacyPassManager?

lxfind added inline comments. Apr 18 2021, 10:43 AM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
2231	Yes, I think so. I will deal with the legacy pass later.

ChuanqiXu mentioned this in D99067: [RFC] [Coroutines] Split Ramp Function. Apr 19 2021, 6:55 PM

Plan to add documentation, fix the legacy pass, and address comments.

ychen added a subscriber: ychen. Apr 27 2021, 11:10 AM

ychen mentioned this in D101980: [RFC] [Coroutines] Put the contents of byval argument to the frame. May 12 2021, 1:26 AM

Revision Contents

	Path	Size
	clang/lib/CodeGen/CGCoroutine.cpp	55 lines
	llvm/include/llvm/IR/Intrinsics.td	6 lines
	llvm/lib/Transforms/Coroutines/CoroEarly.cpp	127 lines
	llvm/lib/Transforms/Coroutines/CoroInternal.h	4 lines
	llvm/lib/Transforms/Coroutines/CoroSplit.cpp	106 lines

Diff 337290

clang/lib/CodeGen/CGCoroutine.cpp

	[541 lines not shown]

	void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {			void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
	auto *NullPtr = llvm::ConstantPointerNull::get(Builder.getInt8PtrTy());			auto *NullPtr = llvm::ConstantPointerNull::get(Builder.getInt8PtrTy());
	auto &TI = CGM.getContext().getTargetInfo();			auto &TI = CGM.getContext().getTargetInfo();
	unsigned NewAlign = TI.getNewAlign() / TI.getCharWidth();			unsigned NewAlign = TI.getNewAlign() / TI.getCharWidth();

	auto *EntryBB = Builder.GetInsertBlock();			auto *EntryBB = Builder.GetInsertBlock();
	auto *AllocBB = createBasicBlock("coro.alloc");			auto *AllocBB = createBasicBlock("coro.alloc");
	auto *InitBB = createBasicBlock("coro.init");			auto *BeginBB = createBasicBlock("coro.begin");
	auto *FinalBB = createBasicBlock("coro.final");			auto *FinalBB = createBasicBlock("coro.final");
	auto *RetBB = createBasicBlock("coro.ret");			auto *RetBB = createBasicBlock("coro.ret");

	auto *CoroId = Builder.CreateCall(			auto *CoroId = Builder.CreateCall(
	CGM.getIntrinsic(llvm::Intrinsic::coro_id),			CGM.getIntrinsic(llvm::Intrinsic::coro_id),
	{Builder.getInt32(NewAlign), NullPtr, NullPtr, NullPtr});			{Builder.getInt32(NewAlign), NullPtr, NullPtr, NullPtr});
	createCoroData(*this, CurCoro, CoroId);			createCoroData(*this, CurCoro, CoroId);
	CurCoro.Data->SuspendBB = RetBB;			CurCoro.Data->SuspendBB = RetBB;
	assert(ShouldEmitLifetimeMarkers &&			assert(ShouldEmitLifetimeMarkers &&
	"Must emit lifetime intrinsics for coroutines");			"Must emit lifetime intrinsics for coroutines");

	// Backend is allowed to elide memory allocations, to help it, emit			// Backend is allowed to elide memory allocations, to help it, emit
	// auto mem = coro.alloc() ? 0 : ... allocation code ...;			// auto mem = coro.alloc() ? 0 : ... allocation code ...;
	auto *CoroAlloc = Builder.CreateCall(			auto *CoroAlloc = Builder.CreateCall(
	CGM.getIntrinsic(llvm::Intrinsic::coro_alloc), {CoroId});			CGM.getIntrinsic(llvm::Intrinsic::coro_alloc), {CoroId});

	Builder.CreateCondBr(CoroAlloc, AllocBB, InitBB);			Builder.CreateCondBr(CoroAlloc, AllocBB, BeginBB);

	EmitBlock(AllocBB);			EmitBlock(AllocBB);
	auto *AllocateCall = EmitScalarExpr(S.getAllocate());			auto *AllocateCall = EmitScalarExpr(S.getAllocate());
	auto *AllocOrInvokeContBB = Builder.GetInsertBlock();			auto *AllocOrInvokeContBB = Builder.GetInsertBlock();

	// Handle allocation failure if 'ReturnStmtOnAllocFailure' was provided.			// Handle allocation failure if 'ReturnStmtOnAllocFailure' was provided.
	if (auto *RetOnAllocFailure = S.getReturnStmtOnAllocFailure()) {			if (auto *RetOnAllocFailure = S.getReturnStmtOnAllocFailure()) {
	auto *RetOnFailureBB = createBasicBlock("coro.ret.on.failure");			auto *RetOnFailureBB = createBasicBlock("coro.ret.on.failure");

	// See if allocation was successful.			// See if allocation was successful.
	auto *NullPtr = llvm::ConstantPointerNull::get(Int8PtrTy);			auto *NullPtr = llvm::ConstantPointerNull::get(Int8PtrTy);
	auto *Cond = Builder.CreateICmpNE(AllocateCall, NullPtr);			auto *Cond = Builder.CreateICmpNE(AllocateCall, NullPtr);
	Builder.CreateCondBr(Cond, InitBB, RetOnFailureBB);			Builder.CreateCondBr(Cond, BeginBB, RetOnFailureBB);

	// If not, return OnAllocFailure object.			// If not, return OnAllocFailure object.
	EmitBlock(RetOnFailureBB);			EmitBlock(RetOnFailureBB);
	EmitStmt(RetOnAllocFailure);			EmitStmt(RetOnAllocFailure);
	}			}
	else {			else {
	Builder.CreateBr(InitBB);			Builder.CreateBr(BeginBB);
	}			}

	EmitBlock(InitBB);			EmitBlock(BeginBB);

	// Pass the result of the allocation to coro.begin.			// Pass the result of the allocation to coro.begin.
	auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);			auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);
	Phi->addIncoming(NullPtr, EntryBB);			Phi->addIncoming(NullPtr, EntryBB);
	Phi->addIncoming(AllocateCall, AllocOrInvokeContBB);			Phi->addIncoming(AllocateCall, AllocOrInvokeContBB);
	auto *CoroBegin = Builder.CreateCall(			auto *CoroBegin = Builder.CreateCall(
	CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});			CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});
	CurCoro.Data->CoroBegin = CoroBegin;			CurCoro.Data->CoroBegin = CoroBegin;

	GetReturnObjectManager GroManager(*this, S);			GetReturnObjectManager GroManager(*this, S);
	GroManager.EmitGroAlloca();			GroManager.EmitGroAlloca();

	CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(RetBB);			CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(RetBB);
	{			{
	ParamReferenceReplacerRAII ParamReplacer(LocalDeclMap);			ParamReferenceReplacerRAII ParamReplacer(LocalDeclMap);
	CodeGenFunction::RunCleanupsScope ResumeScope(*this);			CodeGenFunction::RunCleanupsScope ResumeScope(*this);
	EHStack.pushCleanup<CallCoroDelete>(NormalAndEHCleanup, S.getDeallocate());			EHStack.pushCleanup<CallCoroDelete>(NormalAndEHCleanup, S.getDeallocate());

				// Wrap the parameter copy in a coro.init() check.
				// This allows us to perform the parameter copy in the init function, but
				// not in the ramp function.
				auto *InitBB = createBasicBlock("coro.init");
				auto *InitReadyBB = createBasicBlock("coro.init.ready");
				auto *CoroInit =
				Builder.CreateCall(CGM.getIntrinsic(llvm::Intrinsic::coro_init));
				Builder.CreateCondBr(CoroInit, InitBB, InitReadyBB);

				EmitBlock(InitBB);
				SmallVector<llvm::AllocaInst *, 4> FrameAllocas;
	// Create parameter copies. We do it before creating a promise, since an			// Create parameter copies. We do it before creating a promise, since an
	// evolution of coroutine TS may allow promise constructor to observe			// evolution of coroutine TS may allow promise constructor to observe
	// parameter copies.			// parameter copies.
				int ID = 0;
	for (auto *PM : S.getParamMoves()) {			for (auto *PM : S.getParamMoves()) {
	EmitStmt(PM);			EmitStmt(PM);
	ParamReplacer.addCopy(cast<DeclStmt>(PM));			ParamReplacer.addCopy(cast<DeclStmt>(PM));
				llvm::AllocaInst *Alloca = cast<llvm::AllocaInst>(
				GetAddrOfLocalVar(cast<VarDecl>(cast<DeclStmt>(PM)->getSingleDecl()))
				.getPointer());
				Alloca->setMetadata(
				"coroutine_frame_alloca",
				llvm::MDNode::get(
				getLLVMContext(),
				{
				llvm::ConstantAsMetadata::get(
				Builder.getInt1(false)) /*IsPromise*/,
				llvm::ConstantAsMetadata::get(Builder.getInt32(ID++)),
				}));
	// TODO: if(CoroParam(...)) need to surround ctor and dtor			// TODO: if(CoroParam(...)) need to surround ctor and dtor
	// for the copy, so that llvm can elide it if the copy is			// for the copy, so that llvm can elide it if the copy is
	// not needed.			// not needed.
	}			}

	EmitStmt(S.getPromiseDeclStmt());			EmitStmt(S.getPromiseDeclStmt());

				Builder.CreateCall(CGM.getIntrinsic(llvm::Intrinsic::coro_init_end));
				Builder.CreateBr(InitReadyBB);
				EmitBlock(InitReadyBB);

	Address PromiseAddr = GetAddrOfLocalVar(S.getPromiseDecl());			Address PromiseAddr = GetAddrOfLocalVar(S.getPromiseDecl());
	auto *PromiseAddrVoidPtr =			llvm::AllocaInst *PromiseAlloca =
	new llvm::BitCastInst(PromiseAddr.getPointer(), VoidPtrTy, "", CoroId);			cast<llvm::AllocaInst>(PromiseAddr.getPointer());
	// Update CoroId to refer to the promise. We could not do it earlier because
	// promise local variable was not emitted yet.			PromiseAlloca->setMetadata(
	CoroId->setArgOperand(1, PromiseAddrVoidPtr);			"coroutine_frame_alloca",
				llvm::MDNode::get(
				getLLVMContext(),
				{
				llvm::ConstantAsMetadata::get(
				Builder.getInt1(true)) /*IsPromise*/,
				llvm::ConstantAsMetadata::get(Builder.getInt32(ID++)),
				}));

	// Now we have the promise, initialize the GRO			// Now we have the promise, initialize the GRO
	GroManager.EmitGroInit();			GroManager.EmitGroInit();

	EHStack.pushCleanup<CallCoroEnd>(EHCleanup);			EHStack.pushCleanup<CallCoroEnd>(EHCleanup);

	CurCoro.Data->CurrentAwaitKind = AwaitKind::Init;			CurCoro.Data->CurrentAwaitKind = AwaitKind::Init;
	CurCoro.Data->ExceptionHandler = S.getExceptionHandler();			CurCoro.Data->ExceptionHandler = S.getExceptionHandler();
	[125 lines not shown]

llvm/include/llvm/IR/Intrinsics.td

	[1,268 lines not shown]

	// Coroutine Lowering Intrinsics. Used internally by coroutine passes.			// Coroutine Lowering Intrinsics. Used internally by coroutine passes.

	def int_coro_subfn_addr : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i8_ty],			def int_coro_subfn_addr : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i8_ty],
	[IntrReadMem, IntrArgMemOnly,			[IntrReadMem, IntrArgMemOnly,
	ReadOnly<ArgIndex<0>>,			ReadOnly<ArgIndex<0>>,
	NoCapture<ArgIndex<0>>]>;			NoCapture<ArgIndex<0>>]>;

				def int_coro_frame_get : Intrinsic<[llvm_ptr_ty],
				[llvm_ptr_ty, llvm_ptr_ty, llvm_i1_ty, llvm_i32_ty],
				[IntrNoMem]>;
				def int_coro_init: Intrinsic<[llvm_i1_ty], [], []>;
				def int_coro_init_end: Intrinsic<[], [], []>;

	///===-------------------------- Other Intrinsics --------------------------===//			///===-------------------------- Other Intrinsics --------------------------===//
	//			//
	def int_trap : Intrinsic<[], [], [IntrNoReturn, IntrCold]>,			def int_trap : Intrinsic<[], [], [IntrNoReturn, IntrCold]>,
	GCCBuiltin<"__builtin_trap">;			GCCBuiltin<"__builtin_trap">;
	def int_debugtrap : Intrinsic<[]>,			def int_debugtrap : Intrinsic<[]>,
	GCCBuiltin<"__builtin_debugtrap">;			GCCBuiltin<"__builtin_debugtrap">;
	def int_ubsantrap : Intrinsic<[], [llvm_i8_ty],			def int_ubsantrap : Intrinsic<[], [llvm_i8_ty],
	[IntrNoReturn, IntrCold, ImmArg<ArgIndex<0>>]>;			[IntrNoReturn, IntrCold, ImmArg<ArgIndex<0>>]>;
	[414 lines not shown]

llvm/lib/Transforms/Coroutines/CoroEarly.cpp

//===- CoroEarly.cpp - Coroutine Early Function Pass ----------------------===//		//===- CoroEarly.cpp - Coroutine Early Function Pass ----------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Coroutines/CoroEarly.h"		#include "llvm/Transforms/Coroutines/CoroEarly.h"
#include "CoroInternal.h"		#include "CoroInternal.h"
		#include "llvm/ADT/SetVector.h"
		#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InstIterator.h"		#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
		#include "llvm/IR/Type.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
		#include "llvm/Transforms/Utils/Cloning.h"
		#include "llvm/Transforms/Utils/Local.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "coro-early"		#define DEBUG_TYPE "coro-early"

namespace {		namespace {
// Created on demand if the coro-early pass has work to do.		// Created on demand if the coro-early pass has work to do.
class Lowerer : public coro::LowererBase {		class Lowerer : public coro::LowererBase {
[117 lines not shown]
// NoDuplicate attribute will be removed from coro.begin otherwise, it will		// NoDuplicate attribute will be removed from coro.begin otherwise, it will
// interfere with inlining.		// interfere with inlining.
static void setCannotDuplicate(CoroIdInst *CoroId) {		static void setCannotDuplicate(CoroIdInst *CoroId) {
for (User *U : CoroId->users())		for (User *U : CoroId->users())
if (auto *CB = dyn_cast<CoroBeginInst>(U))		if (auto *CB = dyn_cast<CoroBeginInst>(U))
CB->setCannotDuplicate();		CB->setCannotDuplicate();
}		}

		static void splitRampFunction(Function &F) {
		Module *M = F.getParent();
		LLVMContext &C = M->getContext();
		{
		CoroBeginInst *CoroBegin = cast<CoroBeginInst>(
		&*llvm::find_if(instructions(F),
		[](Instruction &I) { return isa<CoroBeginInst>(&I); }));
		Instruction *InsertPoint = CoroBegin->getNextNode();

		for (Instruction &I : make_early_inc_range(instructions(F))) {
		auto *AI = dyn_cast<AllocaInst>(&I);
		if (!AI)
		continue;
		auto *MD = AI->getMetadata("coroutine_frame_alloca");
		if (!MD)
		continue;

		auto *IsPromise = cast<ConstantAsMetadata>(MD->getOperand(0))->getValue();
		auto *SlotID = cast<ConstantAsMetadata>(MD->getOperand(1))->getValue();
		auto *VoidPt =
		new BitCastInst(AI, llvm::Type::getInt8PtrTy(C), "", InsertPoint);
		auto *FrameGet = CallInst::Create(
		Intrinsic::getDeclaration(M, Intrinsic::coro_frame_get),
		{CoroBegin, VoidPt, IsPromise, SlotID}, "", InsertPoint);
		auto *NewPtr = new BitCastInst(FrameGet, AI->getType(), "", InsertPoint);
		AI->replaceUsesWithIf(NewPtr,
		[&](Use &U) { return U.getUser() != VoidPt; });
		AI->setMetadata("coroutine_frame_alloca", nullptr);
		}
		}

		Function *NewF;
		{
		// Create the split ramp function, and clone.
		llvm::Type *NewFArgTypes[] = {llvm::Type::getInt8PtrTy(C)};
		auto newFuncType =
		Lint (pre-merge checks): clang-tidy: warning: 'auto newFuncType' can be declared as 'auto *newFuncType' [llvm-qualified-auto]; clang-tidy: warning: invalid case style for variable 'newFuncType' [readability-identifier-naming]
		FunctionType::get(F.getReturnType(), NewFArgTypes, false);
		NewF = Function::Create(newFuncType,
		GlobalValue::LinkageTypes::ExternalLinkage,
		F.getName() + ".ramp");
		NewF->addFnAttr(Attribute::NoInline);
		M->getFunctionList().push_back(NewF);
		ValueToValueMapTy VMap;
		for (Argument &A : F.args())
		VMap[&A] = UndefValue::get(A.getType());
		SmallVector<ReturnInst *, 4> Returns;
		CloneFunctionInto(NewF, &F, VMap, CloneFunctionChangeType::LocalChangesOnly,
		Returns);

		for (Instruction &I : make_early_inc_range(instructions(*NewF))) {
		auto *II = dyn_cast<IntrinsicInst>(&I);
		if (!II)
		continue;
		switch (II->getIntrinsicID()) {
		default:
		continue;
		case Intrinsic::coro_begin:
		II->replaceAllUsesWith(NewF->getArg(0));
		break;
		case Intrinsic::coro_init:
		II->replaceAllUsesWith(
		llvm::ConstantInt::get(llvm::Type::getInt1Ty(C), 0));
		break;
		case Intrinsic::coro_alloc:
		II->replaceAllUsesWith(
		llvm::ConstantInt::get(llvm::Type::getInt1Ty(C), 0));
		ChuanqiXuUnsubmitted Not Done Reply Inline Actions Why do we need to replace `coro.alloc` with 0 now? Replace `coro.alloc` with 0 implies we should allocate the frame in the stack. I think we can't know how should we allocate the frame now. ChuanqiXu: Why do we need to replace `coro.alloc` with 0 now? Replace `coro.alloc` with 0 implies we…
		lxfindAuthorUnsubmitted Done Reply Inline Actions This is replacing it in the NewF (the cloned new ramp function). We only need to allocate the frame once, which will be done in the init function. So in the ramp function we can always skip it. lxfind: This is replacing it in the NewF (the cloned new ramp function). We only need to allocate the…
		break;
		}
		II->eraseFromParent();
		}
		removeUnreachableBlocks(*NewF);
		NewF->addFnAttr(CORO_PRESPLIT_ATTR, UNPREPARED_FOR_SPLIT_RAMP);
		}

+  {
+    // Process the init function.
+    IntrinsicInst *CoroBegin = nullptr;
+    IntrinsicInst *CoroInitEnd = nullptr;
+    for (Instruction &I : make_early_inc_range(instructions(F))) {
+      auto *II = dyn_cast<IntrinsicInst>(&I);
+      if (!II)
+        continue;
+      switch (II->getIntrinsicID()) {
+      default:
+        break;
+      case Intrinsic::coro_begin:
+        CoroBegin = II;
+        break;
+      case Intrinsic::coro_init:
+        II->replaceAllUsesWith(
+            llvm::ConstantInt::get(llvm::Type::getInt1Ty(C), 1));
+        II->eraseFromParent();
+        break;
+      case Intrinsic::coro_init_end:
+        CoroInitEnd = II;
+        break;
+      }
+    }
+    assert(CoroInitEnd->getNextNode() ==
+               CoroInitEnd->getParent()->getTerminator() &&
+           "coro.init.end call should be at the end of the init block");
+    CoroInitEnd->getNextNode()->eraseFromParent();
+    CallInst *Ret = CallInst::Create(NewF, {CoroBegin}, "", CoroInitEnd);
+    if (F.getReturnType()->isVoidTy())
+      ReturnInst::Create(C, nullptr, CoroInitEnd);
+    else
+      ReturnInst::Create(C, Ret, CoroInitEnd);
+    CoroInitEnd->eraseFromParent();
+    removeUnreachableBlocks(F);
+    F.addFnAttr(CORO_PRESPLIT_ATTR, DO_NOT_PROCESS);
+  }
+}
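
The two intrinsic-rewriting loops in the function above treat the same intrinsics differently depending on whether they run over the cloned ramp function or the original (init) function. The following is a minimal standalone sketch of that mapping, not LLVM code: `Role` and `foldIntrinsic` are hypothetical names introduced only for illustration, and the returned strings merely describe the replacement value used in the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: which half of the coroutine is being rewritten.
enum class Role { Init, Ramp };

// Returns a description of the value an intrinsic call is replaced with,
// or an empty string when the loop leaves the call in place.
std::string foldIntrinsic(Role R, const std::string &Name) {
  if (R == Role::Ramp) {
    // In the cloned ramp function the frame already exists: it arrives as
    // the single i8* argument, so no allocation or init code is needed.
    if (Name == "coro.begin")
      return "arg0";
    if (Name == "coro.init")
      return "false";
    if (Name == "coro.alloc")
      return "false";
    return "";
  }
  // In the init function only coro.init is folded (to true); coro.begin and
  // coro.init.end are recorded and used to emit the call to the ramp.
  if (Name == "coro.init")
    return "true";
  return "";
}
```

Under this reading, folding `coro.init` to opposite constants is what lets `removeUnreachableBlocks` strip the body from the init function and the initialization code from the ramp function.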

 bool Lowerer::lowerEarlyIntrinsics(Function &F) {
   bool Changed = false;
   CoroIdInst *CoroId = nullptr;
   SmallVector<CoroFreeInst *, 4> CoroFrees;
   for (auto IB = inst_begin(F), IE = inst_end(F); IB != IE;) {
     Instruction &I = *IB++;
     if (auto *CB = dyn_cast<CallBase>(&I)) {
       switch (CB->getIntrinsicID()) {
[18 lines not shown]
       case Intrinsic::coro_noop:
         lowerCoroNoop(cast<IntrinsicInst>(&I));
         break;
       case Intrinsic::coro_id:
         // Mark a function that comes out of the frontend that has a coro.id
         // with a coroutine attribute.
         if (auto *CII = cast<CoroIdInst>(&I)) {
           if (CII->getInfo().isPreSplit()) {
-            F.addFnAttr(CORO_PRESPLIT_ATTR, UNPREPARED_FOR_SPLIT);
             setCannotDuplicate(CII);
             CII->setCoroutineSelf();
             CoroId = cast<CoroIdInst>(&I);
           }
         }
         break;
       case Intrinsic::coro_id_retcon:
       case Intrinsic::coro_id_retcon_once:
[14 lines not shown]
         break;
       }
       Changed = true;
     }
   }
   // Make sure that all CoroFree reference the coro.id intrinsic.
   // Token type is not exposed through coroutine C/C++ builtins to plain C, so
   // we allow specifying none and fixing it up here.
-  if (CoroId)
+  if (CoroId) {
     for (CoroFreeInst *CF : CoroFrees)
       CF->setArgOperand(0, CoroId);
+    splitRampFunction(F);
ChuanqiXu (inline comment, not done): Should we give `splitRampFunction` another name? It may be surprising to see `split` in the Coro-early pass instead of the Coro-split pass. BTW, what do you think about creating the ramp function during frontend CodeGen?
lxfind (author, done): I thought about doing it in CodeGen. But it's really complicated to split functions in CodeGen.
+  }
   return Changed;
 }

 static bool declaresCoroEarlyIntrinsics(const Module &M) {
   return coro::declaresIntrinsics(
       M, {"llvm.coro.id", "llvm.coro.id.retcon", "llvm.coro.id.retcon.once",
           "llvm.coro.id.async", "llvm.coro.destroy", "llvm.coro.done",
           "llvm.coro.end", "llvm.coro.end.async", "llvm.coro.noop",
           "llvm.coro.free", "llvm.coro.promise", "llvm.coro.resume",
           "llvm.coro.suspend"});
 }

 PreservedAnalyses CoroEarlyPass::run(Function &F, FunctionAnalysisManager &) {
+  if (F.getFnAttribute(CORO_PRESPLIT_ATTR).getValueAsString() ==
+      UNPREPARED_FOR_SPLIT_RAMP)
+    return PreservedAnalyses::all();
+
   Module &M = *F.getParent();
   if (!declaresCoroEarlyIntrinsics(M) || !Lowerer(M).lowerEarlyIntrinsics(F))
     return PreservedAnalyses::all();

   PreservedAnalyses PA;
   PA.preserveSet<CFGAnalyses>();
   return PA;
 }
[40 lines not shown]

llvm/lib/Transforms/Coroutines/CoroInternal.h

[31 lines not shown]
 // single function. It forces restart of the pipeline by inserting an indirect
 // call to an empty function "coro.devirt.trigger" which is devirtualized by
 // CoroElide pass that triggers a restart of the pipeline by CGPassManager.
 // When CoroSplit pass sees the same coroutine the second time, it splits it up,
 // adds coroutine subfunctions to the SCC to be processed by IPO pipeline.
 // Async lowering similarily triggers a restart of the pipeline after it has
 // split the coroutine.
 #define CORO_PRESPLIT_ATTR "coroutine.presplit"
-#define UNPREPARED_FOR_SPLIT "0"
+#define DO_NOT_PROCESS "0"
 #define PREPARED_FOR_SPLIT "1"
 #define ASYNC_RESTART_AFTER_SPLIT "2"
+#define UNPREPARED_FOR_SPLIT_RAMP "3"
+#define PREPARED_FOR_SPLIT_INIT "4"

 #define CORO_DEVIRT_TRIGGER_FN "coro.devirt.trigger"

 namespace coro {

 bool declaresIntrinsics(const Module &M,
                         const std::initializer_list<StringRef>);
 void replaceCoroFree(CoroIdInst *CoroId, bool Elide);
[236 lines not shown]

llvm/lib/Transforms/Coroutines/CoroSplit.cpp

[2,043 lines not shown]
 static void addPrepareFunction(const Module &M,
                                SmallVectorImpl<Function *> &Fns,
                                StringRef Name) {
   auto *PrepareFn = M.getFunction(Name);
   if (PrepareFn && !PrepareFn->use_empty())
     Fns.push_back(PrepareFn);
 }

+static Function *getCoroInitFunction(Function &RampFunc) {
+  StringRef RampName = RampFunc.getName();
+  assert(RampName.endswith(".ramp") && "Ramp function must ends with .ramp");
+  StringRef InitName = RampName.substr(0, RampName.size() - 5);
+  return RampFunc.getParent()->getFunction(InitName);
+}
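
The lookup above relies purely on a naming convention: the ramp function is created as `F.getName() + ".ramp"`, and the init function is found again by dropping those five characters. A minimal standalone sketch of that string handling, using `std::string` in place of `StringRef` (the helper name `initNameFromRampName` is hypothetical and not part of the patch):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the suffix handling in getCoroInitFunction.
std::string initNameFromRampName(const std::string &RampName) {
  static const std::string Suffix = ".ramp";
  // Same precondition as the assert in the patch: the name ends with ".ramp".
  assert(RampName.size() >= Suffix.size() &&
         RampName.compare(RampName.size() - Suffix.size(), Suffix.size(),
                          Suffix) == 0 &&
         "ramp function name must end with .ramp");
  // Drop exactly one trailing ".ramp" (5 characters in the patch).
  return RampName.substr(0, RampName.size() - Suffix.size());
}
```

Only the last suffix is removed, so a name that already contains `.ramp` elsewhere still maps back to the name it was derived from.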

+static Function *inlineRampFunction(Function &F) {
+  CallInst *RampCall = cast<CallInst>(
+      &*llvm::find_if(instructions(F), [&](const Instruction &I) {
+        if (const CallInst *CI = dyn_cast<CallInst>(&I))
+          return CI->getCalledFunction()->getName().startswith(F.getName());
+        return false;
+      }));
+  InlineFunctionInfo IFI;
+  InlineFunction(*RampCall, IFI);
+
+  SmallVector<IntrinsicInst *, 2> CoroIds;
+  CoroBeginInst *CoroBegin = nullptr;
+  SmallVector<IntrinsicInst *, 8> CoroFrameGets;
+  for (Instruction &I : instructions(F)) {
+    auto *II = dyn_cast<IntrinsicInst>(&I);
+    if (!II)
+      continue;
+    switch (II->getIntrinsicID()) {
+    default:
+      break;
+    case Intrinsic::coro_id:
+      CoroIds.push_back(II);
+      break;
+    case Intrinsic::coro_begin:
+      CoroBegin = cast<CoroBeginInst>(II);
+      break;
+    case Intrinsic::coro_frame_get:
+      CoroFrameGets.push_back(II);
+      break;
+    }
+  }
+  assert(CoroIds.size() == 2 && "There must be two coro.id calls, from the "
+                                "init function and ramp function respectively");
+  CoroIdInst *RealId = cast<CoroIdInst>(CoroBegin->getId());
+  for (IntrinsicInst *I : CoroIds)
+    if (I != RealId)
+      I->replaceAllUsesWith(RealId);
+  DenseMap<uint32_t, Instruction *> FrameSlotMap;
+  for (IntrinsicInst *FrameGet : CoroFrameGets) {
+    bool IsPromise = cast<ConstantInt>(FrameGet->getOperand(2))->getZExtValue();
+    uint32_t SlotID =
+        cast<ConstantInt>(FrameGet->getOperand(3))->getZExtValue();
+    auto Itr = FrameSlotMap.find(SlotID);
+    Instruction *Ptr;
+    if (Itr == FrameSlotMap.end()) {
+      Ptr = cast<Instruction>(FrameGet->getOperand(1));
+      FrameSlotMap[SlotID] = Ptr;
+    } else {
+      Ptr = Itr->second;
+    }
+    FrameGet->replaceAllUsesWith(Ptr);
+    FrameGet->eraseFromParent();
+    if (IsPromise) {
+      RealId->setOperand(1, new BitCastInst(Ptr->stripPointerCasts(),
+                                            Ptr->getType(), "", RealId));
+    }
+  }
+
+  return RampCall->getCalledFunction();
+}
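
The `FrameSlotMap` loop above appears to unify the `coro.frame.get` calls that refer to the same slot: the pointer operand of the first call seen for a slot ID becomes the replacement for every later call with that ID. A minimal standalone sketch of that first-one-wins rule, with strings standing in for IR values (`ToyFrameGet` and `resolveFrameGets` are hypothetical names used only here):

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one coro.frame.get call: its pointer operand
// (operand 1 in the patch) and its slot ID (operand 3).
struct ToyFrameGet {
  std::string Ptr;
  uint32_t SlotID;
};

// For each call, returns the pointer that replaces it: the operand of the
// first call encountered with the same slot ID.
std::vector<std::string>
resolveFrameGets(const std::vector<ToyFrameGet> &Gets) {
  std::unordered_map<uint32_t, std::string> FrameSlotMap;
  std::vector<std::string> Replacements;
  for (const ToyFrameGet &G : Gets) {
    auto It = FrameSlotMap.find(G.SlotID);
    if (It == FrameSlotMap.end())
      It = FrameSlotMap.emplace(G.SlotID, G.Ptr).first;
    Replacements.push_back(It->second);
  }
  return Replacements;
}
```

With calls for slots 0, 1, 0, 1 in that order, the third and fourth calls are replaced by the pointers of the first and second, so each slot ends up with a single pointer after inlining.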

 PreservedAnalyses CoroSplitPass::run(LazyCallGraph::SCC &C,
                                      CGSCCAnalysisManager &AM,
                                      LazyCallGraph &CG, CGSCCUpdateResult &UR) {
   // NB: One invariant of a valid LazyCallGraph::SCC is that it must contain a
   // non-zero number of nodes, so we assume that here and grab the first
   // node's function's module.
   Module &M = *C.begin()->getFunction().getParent();
   auto &FAM =
[17 lines not shown; enclosing context: if (Coroutines.empty() && PrepareFns.empty())]
     return PreservedAnalyses::all();

   if (Coroutines.empty()) {
     for (auto *PrepareFn : PrepareFns) {
       replaceAllPrepares(PrepareFn, CG, C);
     }
   }

+  SmallVector<Function *, 1> UnpreparedInitFuncs;
+  SmallVector<Function *, 1> InlinedRampFuncs;
   // Split all the coroutines.
   for (LazyCallGraph::Node *N : Coroutines) {
     Function &F = N->getFunction();
     Attribute Attr = F.getFnAttribute(CORO_PRESPLIT_ATTR);
     StringRef Value = Attr.getValueAsString();
     LLVM_DEBUG(dbgs() << "CoroSplit: Processing coroutine '" << F.getName()
                       << "' state: " << Value << "\n");
-    if (Value == UNPREPARED_FOR_SPLIT) {
+    if (Value == DO_NOT_PROCESS)
+      continue;
+    if (Value == UNPREPARED_FOR_SPLIT_RAMP) {
       // Enqueue a second iteration of the CGSCC pipeline on this SCC.
       UR.CWorklist.insert(&C);
-      F.addFnAttr(CORO_PRESPLIT_ATTR, PREPARED_FOR_SPLIT);
+      // Once we allow the ramp function to be optimized, we will split
+      // the init function directly and ignore the ramp function.
+      F.addFnAttr(CORO_PRESPLIT_ATTR, DO_NOT_PROCESS);
+      UnpreparedInitFuncs.push_back(getCoroInitFunction(F));
       continue;
     }
+    if (Value == PREPARED_FOR_SPLIT_INIT) {
+      Function *RampFunc = inlineRampFunction(F);
+      InlinedRampFuncs.push_back(RampFunc);
+      RampFunc->removeDeadConstantUsers();
+      RampFunc->dropAllReferences();
+      updateCGAndAnalysisManagerForCGSCCPass(CG, C, *N, AM, UR, FAM);
+    }
     F.removeFnAttr(CORO_PRESPLIT_ATTR);

     SmallVector<Function *, 4> Clones;
     const coro::Shape Shape = splitCoroutine(F, Clones, ReuseFrameSlot);
     updateCallGraphAfterCoroutineSplit(*N, Shape, Clones, C, CG, AM, UR, FAM);

     if ((Shape.ABI == coro::ABI::Async || Shape.ABI == coro::ABI::Retcon ||
          Shape.ABI == coro::ABI::RetconOnce) &&
         !Shape.CoroSuspends.empty()) {
       // Run the CGSCC pipeline on the newly split functions.
       // All clones will be in the same RefSCC, so choose a random clone.
       UR.RCWorklist.insert(CG.lookupRefSCC(CG.get(*Clones[0])));
     }
   }
+  for (Function *F : UnpreparedInitFuncs)
+    F->addFnAttr(CORO_PRESPLIT_ATTR, PREPARED_FOR_SPLIT_INIT);
+  for (Function *DeadF : InlinedRampFuncs) {
+    auto &DeadC = CG.lookupSCC(CG.lookup(*DeadF));
+    FAM.clear(*DeadF, DeadF->getName());
+    AM.clear(DeadC, DeadC.getName());
+    auto &DeadRC = DeadC.getOuterRefSCC();
+    CG.removeDeadFunction(*DeadF);
+
+    // Mark the relevant parts of the call graph as invalid so we don't visit
+    // them.
+    UR.InvalidatedSCCs.insert(&DeadC);
+    UR.InvalidatedRefSCCs.insert(&DeadRC);
+
+    DeadF->getBasicBlockList().clear();
+    M.getFunctionList().remove(DeadF);
+  }

   if (!PrepareFns.empty()) {
     for (auto *PrepareFn : PrepareFns) {
       replaceAllPrepares(PrepareFn, CG, C);
     }
   }

   return PreservedAnalyses::none();
 }
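
The way `CoroSplitPass::run` above moves the init and ramp functions through the `coroutine.presplit` values can be modeled as a small state machine. The sketch below is a simplified standalone model under several assumptions: functions are plain map entries, "splitting" just records the name, call-graph updates are omitted, and `State`, `ToyModule`, and `visitOnce` are hypothetical names that do not exist in LLVM.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical mirror of the attribute values; None means "no attribute".
enum class State {
  None,
  DoNotProcess,           // "0"
  PreparedForSplit,       // "1"
  AsyncRestartAfterSplit, // "2"
  UnpreparedForSplitRamp, // "3"
  PreparedForSplitInit    // "4"
};

struct ToyModule {
  std::map<std::string, State> Fns;  // function name -> presplit state
  std::vector<std::string> SplitLog; // functions that reached splitCoroutine
};

// Models one visit of the pass. Returns true if the SCC would be re-enqueued.
bool visitOnce(ToyModule &M) {
  std::vector<std::string> PendingInits, DeadRamps;
  bool Requeue = false;
  for (auto &Entry : M.Fns) {
    State &S = Entry.second;
    if (S == State::None || S == State::DoNotProcess)
      continue;
    if (S == State::UnpreparedForSplitRamp) {
      // First visit of the ramp: park it and schedule its init function.
      Requeue = true;
      S = State::DoNotProcess;
      PendingInits.push_back(Entry.first.substr(0, Entry.first.size() - 5));
      continue;
    }
    if (S == State::PreparedForSplitInit)
      DeadRamps.push_back(Entry.first + ".ramp"); // ramp is inlined, then dead
    S = State::None; // attribute removed before splitting
    M.SplitLog.push_back(Entry.first);
  }
  for (const std::string &Name : PendingInits)
    M.Fns[Name] = State::PreparedForSplitInit;
  for (const std::string &Name : DeadRamps)
    M.Fns.erase(Name);
  return Requeue;
}
```

Starting from the state CoroEarly leaves behind (init marked `DO_NOT_PROCESS`, ramp marked `UNPREPARED_FOR_SPLIT_RAMP`), the first visit only flips attributes and requests another iteration; the second visit inlines the ramp into the init function, splits it, and deletes the ramp.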

 namespace {

 // We present a coroutine to LLVM as an ordinary function with suspension
 // points marked up with intrinsics. We let the optimizer party on the coroutine
 // as a single function for as long as possible. Shortly before the coroutine is
 // eligible to be inlined into its callers, we split up the coroutine into parts
 // corresponding to initial, resume and destroy invocations of the coroutine,
 // add them to the current SCC and restart the IPO pipeline to optimize the
 // coroutine subfunctions we extracted before proceeding to the caller of the
 // coroutine.
 struct CoroSplitLegacy : public CallGraphSCCPass {
ChuanqiXu (inline comment, not done): I am not familiar with LLVM's policy on how we should treat legacy passes in trunk. I mean, are we responsible for updating the LegacyPassManager?
lxfind (author, done): Yes, I think so. I will deal with the legacy pass later.
   static char ID; // Pass identification, replacement for typeid

   CoroSplitLegacy(bool ReuseFrameSlot = false)
       : CallGraphSCCPass(ID), ReuseFrameSlot(ReuseFrameSlot) {
     initializeCoroSplitLegacyPass(*PassRegistry::getPassRegistry());
   }

   bool Run = false;
[33 lines not shown; enclosing context: if (Coroutines.empty()) {]
       for (auto *PrepareFn : PrepareFns)
         Changed |= replaceAllPrepares(PrepareFn, CG);
       return Changed;
     }

     createDevirtTriggerFunc(CG, SCC);

     // Split all the coroutines.
+    // FIXME: adapt to the new split model
     for (Function *F : Coroutines) {
       Attribute Attr = F->getFnAttribute(CORO_PRESPLIT_ATTR);
       StringRef Value = Attr.getValueAsString();
       LLVM_DEBUG(dbgs() << "CoroSplit: Processing coroutine '" << F->getName()
                         << "' state: " << Value << "\n");
       // Async lowering marks coroutines to trigger a restart of the pipeline
       // after it has split them.
       if (Value == ASYNC_RESTART_AFTER_SPLIT) {
         F->removeFnAttr(CORO_PRESPLIT_ATTR);
         continue;
       }
-      if (Value == UNPREPARED_FOR_SPLIT) {
+      if (Value == UNPREPARED_FOR_SPLIT_RAMP) {
         prepareForSplit(*F, CG);
         continue;
       }
       F->removeFnAttr(CORO_PRESPLIT_ATTR);

       SmallVector<Function *, 4> Clones;
       const coro::Shape Shape = splitCoroutine(*F, Clones, ReuseFrameSlot);
       updateCallGraphAfterCoroutineSplit(*F, Shape, Clones, CG, SCC);
[38 lines not shown]