This is an archive of the discontinued LLVM Phabricator instance.

Differential D134009

[Coroutines] Introduce `always_complete_coroutine` attribute to guide optimization (1/2)
Needs Review · Public

Authored by ChuanqiXu on Sep 15 2022, 8:06 PM.

Details

Reviewers

rjmccall
ychen
nikic

Summary

Prepare for https://reviews.llvm.org/D134010.

From the perspective of the middle end, the attribute gives the compiler information that lets it simplify the control flow graph in ways that were previously impossible.

Currently, the optimization reduces the size of the destroy function, since the attribute guarantees that the destroy function is only called after the coroutine has completed.
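
As a rough illustration of why that guarantee shrinks the destroy function, here is a hand-written C++ model of a switch-lowered destroy function. It is only a sketch: `Frame`, `destroyGeneral`, and `destroyAlwaysComplete` are hypothetical names, and the code is not what CoroSplit emits. It only mirrors the shape of the test case in this revision, where a local `Cleanup` object is alive across one intermediate suspension point.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a switch-lowered coroutine frame: `index` records the
// suspension point the coroutine is currently parked at.
struct Frame {
  int index = 0;                 // 0 = initial, 1 = intermediate, 2 = final
  std::vector<std::string> log;  // records which cleanup actions ran
};

// General destroy function: the coroutine may be destroyed at any suspension
// point, so every point needs its own cleanup path.
void destroyGeneral(Frame &f) {
  switch (f.index) {
  case 0:  // destroyed before the body ran: no locals are alive yet
    break;
  case 1:  // destroyed mid-body: the local `Cleanup` object is still alive
    f.log.push_back("~Cleanup");
    break;
  case 2:  // destroyed at the final suspend point: locals are already gone
    break;
  }
  f.log.push_back("free frame");
}

// Destroy function under the always-complete assumption: only the final
// suspend point can be current, so the dispatch and its cleanups disappear.
void destroyAlwaysComplete(Frame &f) {
  assert(f.index == 2 && "caller promised the coroutine has completed");
  f.log.push_back("free frame");
}
```

For a completed coroutine both functions perform the same work in this model; the second one simply no longer carries the code for the other cases.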

Event Timeline

ChuanqiXu created this revision. Sep 15 2022, 8:06 PM

Herald added a project: Restricted Project. Sep 15 2022, 8:06 PM

Herald added subscribers: jdoerfert, hiraditya.

ChuanqiXu requested review of this revision. Sep 15 2022, 8:06 PM

Herald added a project: Restricted Project. Sep 15 2022, 8:06 PM

Herald added a subscriber: llvm-commits.

ChuanqiXu edited the summary of this revision. Sep 15 2022, 8:20 PM

ChuanqiXu added reviewers: rjmccall, ychen.

ChuanqiXu set the repository for this revision to rG LLVM Github Monorepo.

Add the new line

ChuanqiXu added a child revision: D134010: [C++] [Coroutines] Introduce `coro_always_done` attribute to help optimizations (2/2). Sep 15 2022, 8:27 PM

Harbormaster completed remote builds in B187038: Diff 460620. Sep 15 2022, 9:18 PM

Can you please specify the semantics of the new attribute in LangRef?

Address comments.

In D134009#3794565, @nikic wrote:

Can you please specify the semantics of the new attribute in LangRef?

Oh, sorry, I forgot because I was thinking of this as a frontend-side change.

Harbormaster completed remote builds in B187089: Diff 460677. Sep 16 2022, 3:12 AM

High-level design comment (I didn't look at the code; wouldn't understand it anyway):
I think there is another important case to cover, besides "a coroutine can only be destroyed after it ran to its final suspension point":
"A coroutine can only be destroyed after it ran to its final suspension point *or* before it has been resumed past its initial suspension point"

Assuming an execution model similar to https://github.com/lewissbaker/cppcoro, i.e. coroutines are always constructed in a suspended state, and there is a when_all function to combine multiple coroutine tasks, I tend to write code like

lazy<int> foo(std::unique_ptr<int>);
lazy<int> bar(std::unique_ptr<float>);

lazy<int> foobar() {
     lazy<int> coro1 = foo(std::make_unique<int>(12));
     lazy<int> coro2 = bar(std::make_unique<float>(1.2));
     auto [x, y] = co_await when_all(std::move(coro1), std::move(coro2));
     co_return x + y;  // a coroutine body must use co_return, not return
}

In this case, coro1 and coro2 can be destroyed after both of them have finished executing (i.e. after when_all has returned).
In addition, coro1 could be destroyed before it is ever co_awaited, in case bar throws.
However, once coro1/coro2 have left their initial suspend point, i.e. once they have been co_awaited, we know for sure that they won't be destroyed while suspended at one of the intermediate suspension points.
I guess this usage pattern is rather common? If so, I think we should design this attribute such that it also accommodates that use case...
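
To make the difference between the two guarantees concrete, the following C++ sketch encodes them as predicates over where a suspended coroutine is parked. The names `SuspendedAt`, `mayDestroyStrict`, and `mayDestroyRelaxed` are hypothetical and exist only for this illustration; nothing here is part of the patch.

```cpp
#include <cassert>

// Hypothetical classification of where a suspended coroutine is parked.
enum class SuspendedAt { Initial, Intermediate, Final };

// Guarantee proposed by this revision: destroy only after completion.
constexpr bool mayDestroyStrict(SuspendedAt s) {
  return s == SuspendedAt::Final;
}

// Relaxed guarantee from the comment above: destroy either before the
// coroutine was ever resumed or after it completed, never in between.
constexpr bool mayDestroyRelaxed(SuspendedAt s) {
  return s == SuspendedAt::Initial || s == SuspendedAt::Final;
}

// In `foobar`, `coro1` is destroyed unstarted if `bar` throws. The relaxed
// rule permits that; the strict rule does not.
static_assert(!mayDestroyStrict(SuspendedAt::Initial),
              "strict rule forbids destroying an unstarted coroutine");
static_assert(mayDestroyRelaxed(SuspendedAt::Initial),
              "relaxed rule allows destroying an unstarted coroutine");
static_assert(!mayDestroyRelaxed(SuspendedAt::Intermediate),
              "neither rule allows destruction at an intermediate point");
```

Under the relaxed rule the destroy function would still need a path for the initial suspension point, so less of it could be removed than under the strict rule.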

Yeah, this is about the design on the language side. I got your point. Generally, the more we relax the restriction, the fewer optimizations we can do. If we want the attribute to cover the example case, then we can't optimize the destroy functions.
But I don't think we need to make the decision now, since we can add more than one attribute! I mean, we can add that attribute later if we find it is really useful. And it doesn't look like a blocker to me.

Agree, not a blocker

avogelsgesang added inline comments. Sep 19 2022, 6:51 AM

llvm/docs/Coroutines.rst
1732 ↗	(On Diff #460677)	I am a bit surprised that we introduce a new IR attribute for this. Can't we reuse the existing branching structure of the suspension points?

I could imagine that we could use the `switch` after each suspend to encode whether the coroutine can be destroyed at that suspension point. If we point the `switch` case for `i8 1`, i.e. the cleanup control flow edge, to an unreachable basic block, shouldn't coroutine splitting already optimize the corresponding code out of the `destroy` function? I imagine something like

    ; For this suspension point, we want to optimize under the assumption that the
    ; coroutine will not be destroyed while suspended.
    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
    ; Hence we let the "cleanup" result (`i8 1`) branch to an "unreachable" block
    switch i8 %0, label %suspend [i8 0, label %continue
                                  i8 1, label %unreachable_bb]
  continue:
    ; After the resumption point, execution continues here...
    ; Let's assume we suspend a 2nd time.
    %1 = call i8 @llvm.coro.suspend(token none, i1 false)
    ; At this suspension point, we allow the coroutine to be destructed.
    switch i8 %1, label %suspend [i8 0, label %continue2
                                  i8 1, label %cleanup]
  continue2:
    ; Some other stuff happens here...
  cleanup:
    %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
    call void @free(i8* %mem)
    br label %suspend
  suspend:
    %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
    ret i8* %hdl
  unreachable_bb:
    unreachable
  }

To prove the viability of this alternative approach, I copied the LLVM code for https://github.com/llvm/llvm-project/issues/56980 from https://godbolt.org/z/P84MPzq4q and manually replaced the `await.cleanup`, `await2.cleanup` ..., `await7.cleanup` basic blocks by "unreachable". You can find the changed LLVM code in https://godbolt.org/z/n93TTj8vo.
As you can see, the generated `Foo() [clone .destroy]` function does not contain any code for the `GlobalSetter` destructors.

I think re-using the `switch` control flow is preferable over introducing a new attribute for two reasons. First, it allows for more fine-grained control: we can control for each suspension point independently whether the coroutine can be destroyed while suspended at this suspension point. Second, it reuses an existing concept instead of introducing a new attribute, thereby keeping LLVM's code simpler.
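
The per-suspension-point control described in this comment can be modeled in a few lines of C++. This is a sketch under stated assumptions: `SuspendPoint` and `destroyCases` are hypothetical names, and the model only tracks which suspension points would still need cleanup code in a destroy function; it does not touch LLVM's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one suspension point in a coroutine.
struct SuspendPoint {
  bool cleanupReachable;  // false models `i8 1, label %unreachable_bb`
};

// Returns the indices of the suspension points for which a destroy function
// would still need cleanup code in this model.
std::vector<std::size_t> destroyCases(const std::vector<SuspendPoint> &points) {
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < points.size(); ++i)
    if (points[i].cleanupReachable)
      kept.push_back(i);
  return kept;
}
```

With the two suspension points from the IR fragment (the first marked unreachable, the second not), only the second index is kept. Marking every non-final point unreachable would correspond, in this model, to what the function-level attribute expresses.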

ChuanqiXu added inline comments. Sep 19 2022, 7:12 PM

llvm/docs/Coroutines.rst
1732 ↗	(On Diff #460677)	Yeah, it is a good idea if we only want to reduce the destroy function. However, if we look at the following explanation of the control flow information, we can no longer infer such information with that trick. Also, the implementation of the patch is not complex: if you look at the code, you'll find that the meaningful change is 5 lines of code in CoroSplit.cpp. I also feel that the attribute is a general property, so it may be possible for other languages to use it. So I guess it might not be a big deal to introduce it.

avogelsgesang added inline comments. Sep 20 2022, 3:46 AM

llvm/docs/Coroutines.rst
1732 ↗	(On Diff #460677)
> However, if we look at the following explanation for the control flow information, we can't infer such information any more by the trick.

Why not? From my understanding, given the `unreachable`, we should also be able to "envision the `ret` in `bb{i}` (i != n) as a branch instruction to the next block `bb{i+1}` with an unknown function call".

Yes, currently the control flow analysis does not yet treat the `unreachable` in a way that uses this additional control flow information. But from what I can tell, `always_complete_coroutine` does not use the additional control flow information either. And I don't see a technical reason why inferring it from `always_complete_coroutine` would be harder than inferring it from the `unreachable`. Or am I missing something?

> Also I feel like the attribute is a general property, it may be possible for other languages to use it

Not sure I can follow. Afaict, the `unreachable` could just as well be used by other language frontends?

avogelsgesang mentioned this in D134010: [C++] [Coroutines] Introduce `coro_always_done` attribute to help optimizations (2/2). Sep 20 2022, 4:35 AM

ChuanqiXu abandoned this revision. Oct 16 2022, 8:00 PM

This comment was removed by ChuanqiXu.

Mis-operations

Revision Contents

Path                                                                 Size
llvm/include/llvm/Bitcode/LLVMBitCodes.h                             1 line
llvm/include/llvm/IR/Attributes.td                                   3 lines
llvm/include/llvm/IR/Function.h                                      8 lines
llvm/lib/Bitcode/Reader/BitcodeReader.cpp                            2 lines
llvm/lib/Bitcode/Writer/BitcodeWriter.cpp                            2 lines
llvm/lib/Transforms/Coroutines/CoroSplit.cpp                         17 lines
llvm/lib/Transforms/Utils/CodeExtractor.cpp                          1 line
llvm/test/Transforms/Coroutines/coro-always_complete_coroutine.ll    139 lines

Diff 460620

llvm/include/llvm/Bitcode/LLVMBitCodes.h

@@ enum AttributeKindCodes {
   ATTR_KIND_DISABLE_SANITIZER_INSTRUMENTATION = 78,
   ATTR_KIND_NO_SANITIZE_BOUNDS = 79,
   ATTR_KIND_ALLOC_ALIGN = 80,
   ATTR_KIND_ALLOCATED_POINTER = 81,
   ATTR_KIND_ALLOC_KIND = 82,
   ATTR_KIND_PRESPLIT_COROUTINE = 83,
   ATTR_KIND_FNRETTHUNK_EXTERN = 84,
   ATTR_KIND_SKIP_PROFILE = 85,
+  ATTR_KIND_ALWAYS_COMPLETE_COROUTINE = 86,
 };

 enum ComdatSelectionKindCodes {
   COMDAT_SELECTION_KIND_ANY = 1,
   COMDAT_SELECTION_KIND_EXACT_MATCH = 2,
   COMDAT_SELECTION_KIND_LARGEST = 3,
   COMDAT_SELECTION_KIND_NO_DUPLICATES = 4,
   COMDAT_SELECTION_KIND_SAME_SIZE = 5,

llvm/include/llvm/IR/Attributes.td

 def ZExt : EnumAttr<"zeroext", [ParamAttr, RetAttr]>;

 /// Function is required to make Forward Progress.
 def MustProgress : EnumAttr<"mustprogress", [FnAttr]>;

 /// Function is a presplit coroutine.
 def PresplitCoroutine : EnumAttr<"presplitcoroutine", [FnAttr]>;

+/// The coroutine is guaranteed to always complete.
+def AlwaysCompleteCoroutine : EnumAttr<"always_complete_coroutine", [FnAttr]>;
+
 /// Target-independent string attributes.
 def LessPreciseFPMAD : StrBoolAttr<"less-precise-fpmad">;
 def NoInfsFPMath : StrBoolAttr<"no-infs-fp-math">;
 def NoNansFPMath : StrBoolAttr<"no-nans-fp-math">;
 def ApproxFuncFPMath : StrBoolAttr<"approx-func-fp-math">;
 def NoSignedZerosFPMath : StrBoolAttr<"no-signed-zeros-fp-math">;
 def UnsafeFPMath : StrBoolAttr<"unsafe-fp-math">;
 def NoJumpTables : StrBoolAttr<"no-jump-tables">;

llvm/include/llvm/IR/Function.h

@@ public:

   /// Determine if the function is presplit coroutine.
   bool isPresplitCoroutine() const {
     return hasFnAttribute(Attribute::PresplitCoroutine);
   }
   void setPresplitCoroutine() { addFnAttr(Attribute::PresplitCoroutine); }
   void setSplittedCoroutine() { removeFnAttr(Attribute::PresplitCoroutine); }

+  /// Determine if the function is a always complete coroutine.
+  bool isAlwaysCompleteCoroutine() const {
+    return hasFnAttribute(Attribute::AlwaysCompleteCoroutine);
+  }
+  void setAlwaysPresplitCoroutine() {
+    addFnAttr(Attribute::AlwaysCompleteCoroutine);
+  }
+
   /// Determine if the function does not access memory.
   bool doesNotAccessMemory() const {
     return hasFnAttribute(Attribute::ReadNone);
   }
   void setDoesNotAccessMemory() {
     addFnAttr(Attribute::ReadNone);
   }

llvm/lib/Bitcode/Reader/BitcodeReader.cpp

@@ static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
   case bitc::ATTR_KIND_BYREF:
     return Attribute::ByRef;
   case bitc::ATTR_KIND_MUSTPROGRESS:
     return Attribute::MustProgress;
   case bitc::ATTR_KIND_HOT:
     return Attribute::Hot;
   case bitc::ATTR_KIND_PRESPLIT_COROUTINE:
     return Attribute::PresplitCoroutine;
+  case bitc::ATTR_KIND_ALWAYS_COMPLETE_COROUTINE:
+    return Attribute::AlwaysCompleteCoroutine;
   }
 }

 Error BitcodeReader::parseAlignmentValue(uint64_t Exponent,
                                          MaybeAlign &Alignment) {
   // Note: Alignment in bitcode files is incremented by 1, so that zero
   // can be used for default alignment.
   if (Exponent > Value::MaxAlignmentExponent + 1)

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp

@@ static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
   case Attribute::NoUndef:
     return bitc::ATTR_KIND_NOUNDEF;
   case Attribute::ByRef:
     return bitc::ATTR_KIND_BYREF;
   case Attribute::MustProgress:
     return bitc::ATTR_KIND_MUSTPROGRESS;
   case Attribute::PresplitCoroutine:
     return bitc::ATTR_KIND_PRESPLIT_COROUTINE;
+  case Attribute::AlwaysCompleteCoroutine:
+    return bitc::ATTR_KIND_ALWAYS_COMPLETE_COROUTINE;
   case Attribute::EndAttrKinds:
     llvm_unreachable("Can not encode end-attribute kinds marker.");
   case Attribute::None:
     llvm_unreachable("Can not encode none-attribute.");
   case Attribute::EmptyKey:
   case Attribute::TombstoneKey:
     llvm_unreachable("Trying to encode EmptyKey/TombstoneKey");
   }

llvm/lib/Transforms/Coroutines/CoroSplit.cpp

 // and the coroutine doesn't suspend at the final suspend point actually (this
 // is possible since the coroutine is considered suspended at the final suspend
 // point if promise.unhandled_exception() exits via an exception), we can
 // remove the last case.
 void CoroCloner::handleFinalSuspend() {
   assert(Shape.ABI == coro::ABI::Switch &&
          Shape.SwitchLowering.HasFinalSuspend);

-  if (isSwitchDestroyFunction() && Shape.SwitchLowering.HasUnwindCoroEnd)
-    return;
-
   auto *Switch = cast<SwitchInst>(VMap[Shape.SwitchLowering.ResumeSwitch]);
   auto FinalCaseIt = std::prev(Switch->case_end());
   BasicBlock *ResumeBB = FinalCaseIt->getCaseSuccessor();

+  if (isSwitchDestroyFunction() && NewF->isAlwaysCompleteCoroutine()) {
+    // If the coroutine is always complete coroutine, we know the destroy
+    // function will only be called when the coroutine is done. So we could
+    // eliminate other branches.
+    Builder.SetInsertPoint(Switch);
+    Builder.CreateBr(ResumeBB);
+    Switch->eraseFromParent();
+    return;
+  }
+
+  if (isSwitchDestroyFunction() && Shape.SwitchLowering.HasUnwindCoroEnd)
+    return;
+
   Switch->removeCase(FinalCaseIt);
   if (isSwitchDestroyFunction()) {
     BasicBlock *OldSwitchBB = Switch->getParent();
     auto *NewSwitchBB = OldSwitchBB->splitBasicBlock(Switch, "Switch");
     Builder.SetInsertPoint(OldSwitchBB->getTerminator());
     auto *GepIndex = Builder.CreateStructGEP(Shape.FrameTy, NewFramePtr,
                                              coro::Shape::SwitchFieldIndex::Resume,
                                              "ResumeFn.addr");

llvm/lib/Transforms/Utils/CodeExtractor.cpp

@@ if (Attr.isStringAttribute()) {
   case Attribute::ReadOnly:
   case Attribute::ReturnsTwice:
   case Attribute::Speculatable:
   case Attribute::StackAlignment:
   case Attribute::WillReturn:
   case Attribute::WriteOnly:
   case Attribute::AllocKind:
   case Attribute::PresplitCoroutine:
+  case Attribute::AlwaysCompleteCoroutine:
     continue;
   // Those attributes should be safe to propagate to the extracted function.
   case Attribute::AlwaysInline:
   case Attribute::Cold:
   case Attribute::DisableSanitizerInstrumentation:
   case Attribute::FnRetThunkExtern:
   case Attribute::Hot:
   case Attribute::NoRecurse:

llvm/test/Transforms/Coroutines/coro-always_complete_coroutine.ll

This file was added.

				; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse,sccp,simplifycfg,early-cse' -S | FileCheck %s

				%"struct.std::coroutine_traits<void>::promise_type" = type { i8 }
				%"struct.std::coroutine_handle" = type { i8 }
				%struct.Cleanup = type { i8 }

				; Function Attrs: presplitcoroutine always_complete_coroutine uwtable
				define dso_local void @_Z1fv() #0 personality ptr @__gxx_personality_v0 {
				entry:
				%__promise = alloca %"struct.std::coroutine_traits<void>::promise_type", align 1
				%agg.tmp = alloca %"struct.std::coroutine_handle", align 1
				%cleanup4 = alloca %struct.Cleanup, align 1
				%agg.tmp7 = alloca %"struct.std::coroutine_handle", align 1
				%agg.tmp19 = alloca %"struct.std::coroutine_handle", align 1
				%0 = bitcast ptr %__promise to ptr
				%1 = call token @llvm.coro.id(i32 16, ptr %0, ptr @_Z1fv, ptr null)
				%2 = call i1 @llvm.coro.alloc(token %1)
				br i1 %2, label %coro.alloc, label %coro.init

				coro.alloc: ; preds = %entry
				%3 = call i64 @llvm.coro.size.i64()
				%call = call noalias noundef nonnull ptr @_Znwm(i64 noundef %3) #8
				br label %coro.init

				coro.init: ; preds = %coro.alloc, %entry
				%4 = phi ptr [ null, %entry ], [ %call, %coro.alloc ]
				%5 = call ptr @llvm.coro.begin(token %1, ptr %4) #9
				call void @llvm.lifetime.start.p0(i64 1, ptr %__promise) #2
				%6 = call token @llvm.coro.save(ptr null)
				call void @_ZNSt16coroutine_handleINSt16coroutine_traitsIJvEE12promise_typeEE12from_addressEPv(ptr noundef %5) #2
				call void @_ZNSt16coroutine_handleIvEC1INSt16coroutine_traitsIJvEE12promise_typeEEES_IT_E(ptr noundef nonnull align 1 dereferenceable(1) %agg.tmp) #2
				%7 = call i8 @llvm.coro.suspend(token %6, i1 false)
				switch i8 %7, label %coro.ret [
				i8 0, label %init.ready
				i8 1, label %cleanup
				]

				init.ready: ; preds = %coro.init
				br label %cleanup

				cleanup: ; preds = %init.ready, %coro.init
				%cleanup.dest.slot.0 = phi i32 [ 0, %init.ready ], [ 2, %coro.init ]
				%cond = icmp eq i32 %cleanup.dest.slot.0, 0
				br i1 %cond, label %cleanup.cont, label %cleanup25

				cleanup.cont: ; preds = %cleanup
				call void @llvm.lifetime.start.p0(i64 1, ptr %cleanup4) #2
				%8 = call token @llvm.coro.save(ptr null)
				call void @_ZNSt16coroutine_handleINSt16coroutine_traitsIJvEE12promise_typeEE12from_addressEPv(ptr noundef %5) #2
				call void @_ZNSt16coroutine_handleIvEC1INSt16coroutine_traitsIJvEE12promise_typeEEES_IT_E(ptr noundef nonnull align 1 dereferenceable(1) %agg.tmp7) #2
				%9 = call i8 @llvm.coro.suspend(token %8, i1 false)
				switch i8 %9, label %coro.ret [
				i8 0, label %await.ready
				i8 1, label %cleanup10
				]

				await.ready: ; preds = %cleanup.cont
				br label %cleanup10

				cleanup10: ; preds = %await.ready, %cleanup.cont
				%cleanup.dest.slot.1 = phi i32 [ 0, %await.ready ], [ 2, %cleanup.cont ]
				%cond1 = icmp eq i32 %cleanup.dest.slot.1, 0
				%spec.select = select i1 %cond1, i32 3, i32 %cleanup.dest.slot.1
				call void @_ZN7CleanupD1Ev(ptr noundef nonnull align 1 dereferenceable(1) %cleanup4) #2
				call void @llvm.lifetime.end.p0(i64 1, ptr %cleanup4) #2
				%cond2 = icmp eq i32 %spec.select, 3
				br i1 %cond2, label %coro.final, label %cleanup25

				coro.final: ; preds = %cleanup10
				%10 = call token @llvm.coro.save(ptr null)
				call void @_ZNSt16coroutine_handleINSt16coroutine_traitsIJvEE12promise_typeEE12from_addressEPv(ptr noundef %5) #2
				call void @_ZNSt16coroutine_handleIvEC1INSt16coroutine_traitsIJvEE12promise_typeEEES_IT_E(ptr noundef nonnull align 1 dereferenceable(1) %agg.tmp19) #2
				%11 = call i8 @llvm.coro.suspend(token %10, i1 true) #9
				switch i8 %11, label %coro.ret [
				i8 0, label %cleanup22
				i8 1, label %cleanup22
				]

				cleanup22: ; preds = %coro.final, %coro.final
				br label %cleanup25

				cleanup25: ; preds = %cleanup22, %cleanup10, %cleanup
				call void @llvm.lifetime.end.p0(i64 1, ptr %__promise) #2
				%12 = call ptr @llvm.coro.free(token %1, ptr %5)
				%13 = icmp ne ptr %12, null
				br i1 %13, label %coro.free, label %coro.ret

				coro.free: ; preds = %cleanup25
				call void @_ZdlPv(ptr noundef %12) #2
				br label %coro.ret

				coro.ret: ; preds = %coro.free, %cleanup25, %coro.final, %cleanup.cont, %coro.init
				%14 = call i1 @llvm.coro.end(ptr null, i1 false) #9
				ret void
				}

				; CHECK: define{{.*}}void @_Z1fv.destroy
				; CHECK-NEXT: entry.destroy:
				; CHECK-NEXT: call void @_ZdlPv
				; CHECK-NEXT: ret

				; CHECK: define{{.*}}void @_Z1fv.cleanup
				; CHECK-NEXT: entry.cleanup:
				; CHECK-NEXT: ret

				declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr) #1
				declare i1 @llvm.coro.alloc(token) #2
				declare dso_local noundef nonnull ptr @_Znwm(i64 noundef) #3
				declare i64 @llvm.coro.size.i64() #4
				declare ptr @llvm.coro.begin(token, ptr writeonly) #2
				declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #5
				declare dso_local i32 @__gxx_personality_v0(...)
				declare token @llvm.coro.save(ptr) #6
				declare dso_local void @_ZNSt16coroutine_handleINSt16coroutine_traitsIJvEE12promise_typeEE12from_addressEPv(ptr noundef) #2
				declare dso_local void @_ZNSt16coroutine_handleIvEC1INSt16coroutine_traitsIJvEE12promise_typeEEES_IT_E(ptr noundef nonnull align 1 dereferenceable(1)) unnamed_addr #2
				declare i8 @llvm.coro.suspend(token, i1) #2
				declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #5
				declare dso_local void @_ZN7CleanupD1Ev(ptr noundef nonnull align 1 dereferenceable(1)) unnamed_addr #2
				declare dso_local void @_ZdlPv(ptr noundef) #7
				declare ptr @llvm.coro.free(token, ptr nocapture readonly) #1
				declare i1 @llvm.coro.end(ptr, i1) #2

				attributes #0 = { presplitcoroutine always_complete_coroutine uwtable }
				attributes #1 = { argmemonly nounwind readonly }
				attributes #2 = { nounwind }
				attributes #3 = { nobuiltin allocsize(0) }
				attributes #4 = { nounwind readnone }
				attributes #5 = { argmemonly nocallback nofree nosync nounwind willreturn }
				attributes #6 = { nomerge nounwind }
				attributes #7 = { nobuiltin nounwind }
				attributes #8 = { allocsize(0) }
				attributes #9 = { noduplicate }

				!llvm.linker.options = !{}
				!llvm.module.flags = !{!0, !1}
				!llvm.ident = !{}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 7, !"uwtable", i32 2}