This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/lib/Passes/PassBuilder.cpp
llvm/lib/Transforms/Coroutines/CoroSplit.cpp
llvm/test/Transforms/Coroutines/coro-alloc-with-param-O0.ll
llvm/test/Transforms/Coroutines/coro-alloca-01.ll
llvm/test/Transforms/Coroutines/coro-alloca-04.ll
llvm/test/Transforms/Coroutines/coro-alloca-05.ll
llvm/test/Transforms/Coroutines/restart-trigger.ll

Differential D95807

[Coroutines] Add the newly generated SCCs back to the CGSCC work queue after CoroSplit actually happened
ClosedPublic

Authored by lxfind on Feb 1 2021, 11:28 AM.

Details

Reviewers

aeubanks
junparser
ChuanqiXu
rjmccall
asbirlea

Commits

rG822b92aae439: [Coroutines] Add the newly generated SCCs back to the CGSCC work queue after…

Summary

Relevant discussion can be found at: https://lists.llvm.org/pipermail/llvm-dev/2021-January/148197.html
In the existing design, an SCC that contains a coroutine will go through the following passes:
Inliner -> CoroSplitPass (fake) -> FunctionSimplificationPipeline -> Inliner -> CoroSplitPass (real) -> FunctionSimplificationPipeline

The first CoroSplitPass doesn't do anything other than putting the SCC back to the queue so that the entire pipeline can repeat.
As you can see, we run Inliner twice on the SCC consecutively without doing any real split, which is unnecessary and likely unintended.
What we really wanted is this:
Inliner -> FunctionSimplificationPipeline -> CoroSplitPass -> FunctionSimplificationPipeline
(note that we don't really need to run Inliner again on the ramp function after split).

Hence the approach taken here is to move CoroSplitPass to the end of the CGSCC pipeline, run it once for real, insert the newly generated SCCs (the clones) back into the pipeline so that they can be optimized, and add a function simplification pipeline after CoroSplit to optimize the post-split ramp function.

This approach also conforms to how the new pass manager works instead of relying on an ad hoc post-split cleanup, making it ready for a full switch to the new pass manager eventually.

Judging by some of the changes to the tests, we can already observe that this change allows more optimizations to be applied to coroutines.
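
The intended ordering from the summary can be modelled with a toy worklist. This is only a sketch under stated assumptions: `ToyScc`, `runToyPipeline` and the pass names recorded as strings are hypothetical and are not part of LLVM's pass manager API. The model simply logs which "pass" would see which SCC, and re-queues the clones after a split.

```cpp
#include <cassert>
#include <deque>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-in for an SCC; not an LLVM type.
struct ToyScc {
  std::string Name;
  bool IsUnsplitCoroutine;
};

// Runs the toy pipeline and returns a log of "<scc>:<pass>" entries.
std::vector<std::string> runToyPipeline(std::deque<ToyScc> Worklist) {
  std::vector<std::string> Log;
  while (!Worklist.empty()) {
    ToyScc C = Worklist.front();
    Worklist.pop_front();
    Log.push_back(C.Name + ":Inliner");
    Log.push_back(C.Name + ":FunctionSimplification");
    if (!C.IsUnsplitCoroutine)
      continue;
    // The split happens exactly once, at the end of the per-SCC pipeline.
    Log.push_back(C.Name + ":CoroSplit");
    // The clones are queued so the regular pipeline optimizes them too.
    for (const char *Suffix : {".resume", ".destroy", ".cleanup"})
      Worklist.push_back(ToyScc{C.Name + Suffix, false});
    // The post-split ramp function gets one more simplification run.
    Log.push_back(C.Name + ":FunctionSimplification");
  }
  return Log;
}
```

In this model a coroutine `f` is logged as Inliner, FunctionSimplification, CoroSplit, FunctionSimplification, and each clone then goes through the pipeline once without any "fake" split step.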

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
130 ms	x64 debian > Clang.CodeGenCoroutines::coro-newpm-pipeline.cpp
70 ms	x64 debian > LLVM.Transforms/Coroutines::coro-alloca-06.ll
90 ms	x64 debian > LLVM.Transforms/Coroutines::coro-async.ll
50 ms	x64 debian > LLVM.Transforms/Coroutines::coro-catchswitch-cleanuppad.ll
60 ms	x64 debian > LLVM.Transforms/Coroutines::coro-catchswitch.ll
(68 tests failed in total)

Event Timeline

lxfind created this revision. Feb 1 2021, 11:28 AM

Herald added subscribers: hoy, modimo, wenlei, hiraditya. Feb 1 2021, 11:28 AM

lxfind requested review of this revision. Feb 1 2021, 11:28 AM

Herald added a project: Restricted Project. Feb 1 2021, 11:28 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B87395: Diff 320539. Feb 1 2021, 12:22 PM

aeubanks added inline comments. Feb 1 2021, 12:33 PM

llvm/lib/Passes/PassBuilder.cpp
625	we can still keep this here right? since we'll run the function simplification pipeline on the split coroutine also, this should be duplicated in `buildFunctionSimplificationPipeline()`
llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1710	this could regress coroutines under the legacy PM behavior since we don't run `postSplitCleanup()` here? but maybe we don't care about the legacy PM so much anymore.
2004	are all `Clones` guaranteed to be in different SCC?

lxfind added inline comments. Feb 1 2021, 12:46 PM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1710	Not really. CoroSplit for legacy PM works in a similar way. Function simplifications happen after CoroSplit (if not O0) as well.
2004	No. But CWorklist is a map so redundant insertions won't happen.
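
The point about redundant insertions can be illustrated with a small deduplicating worklist. This is a minimal sketch of the general idea only: `DedupWorklist` is a hypothetical class written for this example and is not the type of `CWorklist` in LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical worklist that ignores items that are already pending.
template <typename T> class DedupWorklist {
  std::vector<T> Items;          // Pending items in insertion order.
  std::unordered_set<T> Pending; // Membership index used to reject repeats.

public:
  // Returns true if the item was newly queued, false if it was already pending.
  bool insert(const T &Item) {
    if (!Pending.insert(Item).second)
      return false;
    Items.push_back(Item);
    return true;
  }

  bool empty() const { return Items.empty(); }
  std::size_t size() const { return Items.size(); }

  // Removes and returns the most recently queued item.
  T pop() {
    T Item = Items.back();
    Items.pop_back();
    Pending.erase(Item);
    return Item;
  }
};
```

With a structure like this, queuing the SCC of every clone is safe even when several clones share one SCC: the second and later insertions of the same SCC are no-ops.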

I can't really review the subtle pass-management stuff here, but if you're going to be able to eliminate the hacky post-pass cleanup pipeline, I'm overjoyed.

lxfind added inline comments. Feb 1 2021, 1:41 PM

llvm/lib/Passes/PassBuilder.cpp
625	We could. It does create a logically strange situation, that CoroElide will run once before CoroSplit. How to decide whether CoroElide should go into simplification pipeline or optimization pipeline?

ChuanqiXu added inline comments. Feb 1 2021, 6:21 PM

llvm/lib/Passes/PassBuilder.cpp
625	I agree that it is strange for CoroElide to run once before CoroSplit. But if we only run CoroElide after CoroSplit, we may miss many optimization opportunities. Suppose a coroutine A was inlined into a coroutine B, and CoroElide runs on B before B is split:

    define B(...) {
    entry:
      %A.handle = ...                            ; The handle for A
      ...                                        ; Some operations with A
      call @llvm.coro.suspend(%B.handle, false)  ; which is possible if the implementation of initial_suspend of the promise type of B isn't std::experimental::suspend_always
      ...                                        ; Some other operations with A
      call @llvm.subfn.addr(%A.handle, 1)        ; means the end for the lifetime of A
    }

In this example, CoroElide would find that every path from the definition of %A.handle to the exit point of B passes through `llvm.subfn.addr(%A.handle, 1)`, which marks the end of the lifetime of `%A.handle`, so CoroElide may conclude that `%A.handle` does not escape in B. It is therefore possible that A may be elided in B. However, if we run CoroElide only after B has been split, we may miss the opportunity to elide A in B. I can give a concrete example if needed. I agree that the design of CoroElide seems a little strange and probably needs to be refactored, but I don't have a clear and feasible solution right now.
llvm/lib/Transforms/Coroutines/CoroSplit.cpp
2001	I am not familiar with any Shape.ABI other than coro::ABI::Switch, but this diff line seems strange: it looks like the condition gets weaker.

ChuanqiXu added inline comments. Feb 1 2021, 6:26 PM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1032	Is it necessary to keep the verification step?

rjmccall added inline comments. Feb 1 2021, 9:51 PM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1032	It's probably okay for it to go away now.

ChuanqiXu added inline comments. Feb 2 2021, 1:27 AM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1032	Personally, I would prefer to keep the verification, since it has reported many errors for me :).

By the way, is everyone comfortable with dropping the test lines using the legacy pass manager for Coroutine?
It's really painful to update the tests to keep both legacy pass manager and new pass manager, now that the pass sequence becomes much different. (CoroSplit runs late in CGSCC pipeline for NewPM, but early in legacy pm).
If we want to keep the legacy PM tests for a while for coroutine, I could wait until everyone is comfortable before updating this patch again.

In D95807#2537105, @lxfind wrote:

By the way, is everyone comfortable with dropping the test lines using the legacy pass manager for Coroutine?
It's really painful to update the tests to keep both legacy pass manager and new pass manager, now that the pass sequence becomes much different. (CoroSplit runs late in CGSCC pipeline for NewPM, but early in legacy pm).
If we want to keep the legacy PM tests for a while for coroutine, I could wait until everyone is comfortable before updating this patch again.

What's most important is to test the pass. If you want to port most of the tests to unconditionally test under the new PM, that's fine with me. We should keep some tests that validate basic full-coroutine-pipeline behavior under the old PM until we actually retire the old PM, though.

In D95807#2537105, @lxfind wrote:

By the way, is everyone comfortable with dropping the test lines using the legacy pass manager for Coroutine?
It's really painful to update the tests to keep both legacy pass manager and new pass manager, now that the pass sequence becomes much different. (CoroSplit runs late in CGSCC pipeline for NewPM, but early in legacy pm).
If we want to keep the legacy PM tests for a while for coroutine, I could wait until everyone is comfortable before updating this patch again.

I think testing is more important than the developer experience. It is more suitable to remove the tests for the legacy pass manager once the old pass manager has been replaced entirely.

aeubanks added inline comments. Feb 3 2021, 11:37 AM

llvm/lib/Passes/PassBuilder.cpp
625	The simplification pipeline is anything that simplifies/canonicalizes IR. Passes in the optimization pipeline may add complexity. If CoroElide mostly ends up cleaning things, it should definitely be in the simplification pipeline. I'm not super familiar with the coroutine passes, but given that we now unconditionally re-add the current SCC to the queue, CoroElide should run before and after CoroSplit on any given function that is split. If that's good enough then I'd say keep this here.
llvm/lib/Transforms/Coroutines/CoroSplit.cpp
2001	I believe that's intentional, and a big part of this patch. We want to re-add the current SCC (and the split SCCs) any time we split an SCC. Before we weren't properly doing that.
2004	Oh I didn't know that, thanks!

ChuanqiXu added inline comments. Feb 3 2021, 6:30 PM

llvm/lib/Passes/PassBuilder.cpp
625	The CoroElide pass tries to transform heap allocation into stack allocation, which is a major optimization in the coroutine passes. To my understanding, that is not the same as simplification, though I am not entirely sure about this. For now I would prefer to run CoroElide both before and after CoroSplit, even if it is strange.

aeubanks added inline comments. Feb 3 2021, 7:30 PM

llvm/lib/Passes/PassBuilder.cpp
625	Your description of the pass sounds like simplification/canonicalization to me.

In D95807#2538160, @ChuanqiXu wrote:

In D95807#2537105, @lxfind wrote:

By the way, is everyone comfortable with dropping the test lines using the legacy pass manager for Coroutine?
It's really painful to update the tests to keep both legacy pass manager and new pass manager, now that the pass sequence becomes much different. (CoroSplit runs late in CGSCC pipeline for NewPM, but early in legacy pm).
If we want to keep the legacy PM tests for a while for coroutine, I could wait until everyone is comfortable before updating this patch again.

I think testing is more important than the developer experience. It is more suitable to remove the tests for the legacy pass manager once the old pass manager has been replaced entirely.

By the way, since the new PM has been enabled by default, all of the tests are no longer testing Legacy PM behavior anymore (unless explicitly specified --enable-new-pm=0)

In D95807#2554905, @lxfind wrote:

In D95807#2538160, @ChuanqiXu wrote:

In D95807#2537105, @lxfind wrote:

By the way, is everyone comfortable with dropping the test lines using the legacy pass manager for Coroutine?
It's really painful to update the tests to keep both legacy pass manager and new pass manager, now that the pass sequence becomes much different. (CoroSplit runs late in CGSCC pipeline for NewPM, but early in legacy pm).
If we want to keep the legacy PM tests for a while for coroutine, I could wait until everyone is comfortable before updating this patch again.

I think testing is more important than the developer experience. It is more suitable to remove the tests for the legacy pass manager once the old pass manager has been replaced entirely.

By the way, since the new PM has been enabled by default, all of the tests are no longer testing Legacy PM behavior anymore (unless explicitly specified --enable-new-pm=0)

To my understanding, this means that it is OK to keep the legacy PM test lines, since they would be ignored?

lxfind planned changes to this revision. Feb 18 2021, 8:57 AM

Since we seem to be nearly done with the legacy PM, I think it's acceptable to no longer test it specifically for the coro passes.

junparser mentioned this in D96928: [LICM][Coroutine] Don't sink stores from loops with coro.suspend instructions. Feb 19 2021, 6:20 PM

After removing the legacy test command, I was finally able to update this patch. It's now ready for review. I will update the description to reflect the latest changes.

Herald added a project: Restricted Project. Jun 28 2021, 9:22 PM

Herald added subscribers: cfe-commits, qcolombet.

lxfind retitled this revision from [RFC][Coroutines] Add the newly generated SCCs back to the CGSCC work queue after CoroSplit actually happened to [Coroutines] Add the newly generated SCCs back to the CGSCC work queue after CoroSplit actually happened. Jun 28 2021, 9:25 PM

lxfind edited the summary of this revision.

Harbormaster completed remote builds in B111439: Diff 355110. Jun 28 2021, 9:59 PM

note that we don't really need to run Inliner again on the ramp function after split

This isn't accurate. The inline may run again for ramp function after split and it's required by coro elide.

It seems like that we don't need the attribute CORO_PRESPLIT_ATTR any more, do we? If yes, I think we should remove them.

In D95807#2846358, @ChuanqiXu wrote:

note that we don't really need to run Inliner again on the ramp function after split

This isn't accurate. The inline may run again for ramp function after split and it's required by coro elide.

If there is an inlining opportunity, it should have happened pre-split, right? Is there any reason it didn't happen pre-split but only post-split?

It seems like that we don't need the attribute CORO_PRESPLIT_ATTR any more, do we? If yes, I think we should remove them.

It's still needed by the legacy pass manager. I don't want to break that yet.

In D95807#2847449, @lxfind wrote:

In D95807#2846358, @ChuanqiXu wrote:

note that we don't really need to run Inliner again on the ramp function after split

This isn't accurate. The inline may run again for ramp function after split and it's required by coro elide.

If there is an inlining opportunity, it should have happened pre-split, right? Is there any reason it didn't happen pre-split but only post-split?

Since we prevent inlining of coroutines before the split, a coroutine can't be inlined pre-split. After splitting, because of the SCC changes, the inliner runs again for the ramp function.

LGTM. But I am not familiar with the pipeline, please wait for 1~2 days in case there are more comments.

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1992–1995	These are not needed.

This revision is now accepted and ready to land. Jun 29 2021, 6:55 PM

this will run the function simplification pipeline twice on every single function when coroutines are enabled, I don't think that's the intention

I thought the intention was to do all the re-adding of SCCs inside CoroSplit.cpp, including the SCC with the function that was split

In D95807#2849053, @aeubanks wrote:

this will run the function simplification pipeline twice on every single function when coroutines are enabled, I don't think that's the intention

I thought the intention was to do all the re-adding of SCCs inside CoroSplit.cpp, including the SCC with the function that was split

Good point. I was trying to avoid the second inliner on the coroutine ramp function. But I guess the cost will be bigger than the win.

In D95807#2849120, @lxfind wrote:

In D95807#2849053, @aeubanks wrote:

this will run the function simplification pipeline twice on every single function when coroutines are enabled, I don't think that's the intention

I thought the intention was to do all the re-adding of SCCs inside CoroSplit.cpp, including the SCC with the function that was split

Good point. I was trying to avoid the second inliner on the coroutine ramp function. But I guess the cost will be bigger than the win.

If coroutine ramp function couldn't get inlined, it would disable coroutine elide optimization. Could you elaborate more on why do you want to do that?

If coroutine ramp function couldn't get inlined, it would disable coroutine elide optimization. Could you elaborate more on why do you want to do that?

Ramp function will eventually be inlined, but not when you run Inliner on the inlinee.
Let's say coroutine A calls coroutine B, and eventually we want to inline B into A so that we could perform CoroElide on A.
After B is split, we don't need to run inliner again on B. When we run inliner on A, A will inline B.
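
The ordering argument above can be sketched with a toy call graph. The sketch assumes a bottom-up (callee-first) walk and an acyclic graph; `ToyCallGraph`, `visitPostOrder` and `bottomUpOrder` are hypothetical names used only for this illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: maps a function name to the functions it calls.
using ToyCallGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first post-order visit: all callees are emitted before the caller.
static void visitPostOrder(const ToyCallGraph &G, const std::string &Fn,
                           std::set<std::string> &Seen,
                           std::vector<std::string> &Order) {
  if (!Seen.insert(Fn).second)
    return; // Already visited.
  auto It = G.find(Fn);
  if (It != G.end())
    for (const std::string &Callee : It->second)
      visitPostOrder(G, Callee, Seen, Order);
  Order.push_back(Fn);
}

// Returns the functions in callee-before-caller order.
std::vector<std::string> bottomUpOrder(const ToyCallGraph &G) {
  std::set<std::string> Seen;
  std::vector<std::string> Order;
  for (const auto &Entry : G)
    visitPostOrder(G, Entry.first, Seen, Order);
  return Order;
}
```

For a graph in which A calls B, B is processed (and split) before A, so the inliner run on A is the one that can inline B's ramp function; no extra inliner run on B itself is needed for that.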

In D95807#2849128, @lxfind wrote:

If coroutine ramp function couldn't get inlined, it would disable coroutine elide optimization. Could you elaborate more on why do you want to do that?

Ramp function will eventually be inlined, but not when you run Inliner on the inlinee.
Let's say coroutine A calls coroutine B, and eventually we want to inline B into A so that we could perform CoroElide on A.
After B is split, we don't need to run inliner again on B. When we run inliner on A, A will inline B.

Thanks for clarifying.

Put the post-split ramp function back to the CGSCC worklist

Harbormaster completed remote builds in B111669: Diff 355431. Jun 29 2021, 10:43 PM

ChuanqiXu added inline comments. Jun 29 2021, 11:00 PM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1992–1995	Refactoring this into `LLVM_DEBUG(dbgs() << "CoroSplit: Processing coroutine '" << F.getName() << "' state: " << Attr.getValueAsString() << "\n");` would avoid a warning in release builds.
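
The general pattern behind this suggestion, keeping a value that only the debug output needs inside the debug macro's argument, can be sketched as follows. `TOY_DEBUG`, `TOY_ENABLE_DEBUG_LOG`, `lookupState` and `processCoroutine` are hypothetical names for this illustration; the macro is only a simplified stand-in and does not reproduce how `LLVM_DEBUG` is implemented.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical debug-only macro: its argument is compiled only when
// TOY_ENABLE_DEBUG_LOG is defined, and disappears entirely otherwise.
#ifdef TOY_ENABLE_DEBUG_LOG
#define TOY_DEBUG(X)                                                           \
  do {                                                                         \
    X;                                                                         \
  } while (false)
#else
#define TOY_DEBUG(X)                                                           \
  do {                                                                         \
  } while (false)
#endif

// Hypothetical lookup whose result is needed only for logging.
inline const char *lookupState(const std::string &Name) {
  return Name.empty() ? "unknown" : "presplit";
}

// Warning-prone shape: when logging is compiled out, `State` is never read,
// which may trigger an unused-variable warning:
//   const char *State = lookupState(Name);
//   TOY_DEBUG(std::cerr << Name << " state: " << State << "\n");
//
// Suggested shape: the lookup lives inside the macro argument, so nothing is
// left behind when logging is compiled out.
inline bool processCoroutine(const std::string &Name) {
  TOY_DEBUG(std::cerr << "Processing coroutine '" << Name
                      << "' state: " << lookupState(Name) << "\n");
  return !Name.empty();
}
```

The trade-off is that the lookup is no longer available to non-debug code, which is acceptable here because the value is used for logging only.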

The CoroSplit/PassBuilder changes lgtm

lxfind added inline comments. Jun 30 2021, 9:52 AM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1992–1995	Good catch. How did you catch this? I don't seem to see the warning on my Mac by default.

fix warning

ychen added a subscriber: ychen. Jun 30 2021, 10:10 AM

ychen added inline comments.

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1995	drive-by nit: would it be better to move it up near `Coroutines.push_back(&N);`?
2001	I got your point. So "// All clones will be in the same RefSCC ...." : this is not accurate I think?

lxfind added inline comments. Jun 30 2021, 10:17 AM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
2001	Note that previously this was done only for the Async, Retcon and RetconOnce ABIs, not for the Switch ABI. I guess that's accurate for those ABIs? But for the Switch ABI this is not true. And before, we were not adding the split functions back to the pipeline to be properly optimized. Now we are doing that. This should help improve the performance of the post-split functions.

Harbormaster completed remote builds in B111791: Diff 355607. Jun 30 2021, 11:17 AM

This revision was landed with ongoing or failed builds. Jun 30 2021, 11:38 AM

Closed by commit rG822b92aae439: [Coroutines] Add the newly generated SCCs back to the CGSCC work queue after… (authored by lxfind).

This revision was automatically updated to reflect the committed changes.

lxfind added a commit: rG822b92aae439: [Coroutines] Add the newly generated SCCs back to the CGSCC work queue after….

ychen added inline comments. Jun 30 2021, 11:56 AM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
2001	I missed the ABI condition. Thanks for the explanation.

Revision Contents

Path	Size
llvm/lib/Passes/PassBuilder.cpp	13 lines
llvm/lib/Transforms/Coroutines/CoroSplit.cpp	53 lines
llvm/test/Transforms/Coroutines/coro-alloc-with-param-O0.ll	9 lines
(file name not shown)	4 lines
(file name not shown)	7 lines
(file name not shown)	6 lines
(file name not shown)	7 lines

Diff 320539

llvm/lib/Passes/PassBuilder.cpp

@@ PassBuilder::buildO1FunctionSimplificationPipeline(OptimizationLevel Level,
   // opportunities that creates).
   FPM.addPass(BDCEPass());

   // Run instcombine after redundancy and dead bit elimination to exploit
   // opportunities opened up by them.
   FPM.addPass(InstCombinePass());
   invokePeepholeEPCallbacks(FPM, Level);

-  if (PTO.Coroutines)
-    FPM.addPass(CoroElidePass());
-
   for (auto &C : ScalarOptimizerLateEPCallbacks)
     C(FPM, Level);

   // Finally, do an expensive DCE pass to catch all the dead code exposed by
   // the simplifications and basic cleanup after all the simplifications.
   // TODO: Investigate if this is too expensive.
   FPM.addPass(ADCEPass());
   FPM.addPass(SimplifyCFGPass());
@@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
   // Note: historically, the PruneEH pass was run first to deduce nounwind and
   // generally clean up exception handling overhead. It isn't clear this is
   // valuable as the inliner doesn't currently care whether it is inlining an
   // invoke or a call.

   if (AttributorRun & AttributorRunOption::CGSCC)
     MainCGPipeline.addPass(AttributorCGSCCPass());

-  if (PTO.Coroutines)
-    MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
-
   // Now deduce any function attributes based in the current code.
   MainCGPipeline.addPass(PostOrderFunctionAttrsPass());

   // When at O3 add argument promotion to the pass pipeline.
   // FIXME: It isn't at all clear why this should be limited to O3.
   if (Level == OptimizationLevel::O3)
     MainCGPipeline.addPass(ArgumentPromotionPass());

   // Try to perform OpenMP specific optimizations. This is a (quick!) no-op if
   // there are no OpenMP runtime calls present in the module.
   if (Level == OptimizationLevel::O2 || Level == OptimizationLevel::O3)
     MainCGPipeline.addPass(OpenMPOptPass());

   for (auto &C : CGSCCOptimizerLateEPCallbacks)
     C(MainCGPipeline, Level);

   // Lastly, add the core function simplification pipeline nested inside the
   // CGSCC walk.
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
       buildFunctionSimplificationPipeline(Level, Phase)));

+  if (PTO.Coroutines)
+    MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
+
   return MIWP;
 }

 ModulePassManager
 PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
                                                ThinOrFullLTOPhase Phase) {
   ModulePassManager MPM(DebugLogging);

@@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   // resulted in single-entry-single-exit or empty blocks. Clean up the CFG.
   OptimizePM.addPass(SimplifyCFGPass());

   // Optimize PHIs by speculating around them when profitable. Note that this
   // pass needs to be run after any PRE or similar pass as it is essentially
   // inserting redundancies into the program. This even includes SimplifyCFG.
   OptimizePM.addPass(SpeculateAroundPHIsPass());

-  if (PTO.Coroutines)
+  if (PTO.Coroutines) {
+    OptimizePM.addPass(CoroElidePass());
     OptimizePM.addPass(CoroCleanupPass());
+  }

   // Add the core optimizing pipeline.
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(OptimizePM)));

   for (auto &C : OptimizerLastEPCallbacks)
     C(MPM, Level);

   if (PTO.CallGraphProfile)

llvm/lib/Transforms/Coroutines/CoroSplit.cpp

Show First 20 Lines • Show All 1,017 Lines • ▼ Show 20 Lines	static void updateCoroFrame(coro::Shape &Shape, Function *ResumeFn,
}		}

auto *DestroyAddr = Builder.CreateStructGEP(		auto *DestroyAddr = Builder.CreateStructGEP(
Shape.FrameTy, Shape.FramePtr, coro::Shape::SwitchFieldIndex::Destroy,		Shape.FrameTy, Shape.FramePtr, coro::Shape::SwitchFieldIndex::Destroy,
"destroy.addr");		"destroy.addr");
Builder.CreateStore(DestroyOrCleanupFn, DestroyAddr);		Builder.CreateStore(DestroyOrCleanupFn, DestroyAddr);
}		}

static void postSplitCleanup(Function &F) {
removeUnreachableBlocks(F);

// For now, we do a mandatory verification step because we don't
// entirely trust this pass. Note that we don't want to add a verifier
// pass to FPM below because it will also verify all the global data.
if (verifyFunction(F, &errs()))
ChuanqiXuUnsubmitted Not Done Reply Inline Actions Is it necessary to remain the verify process? ChuanqiXu: Is it necessary to remain the verify process?
rjmccallUnsubmitted Not Done Reply Inline Actions It's probably okay for it to go away now. rjmccall: It's probably okay for it to go away now.
ChuanqiXuUnsubmitted Not Done Reply Inline Actions I prefer to remain the verification personally since it had reported many errors for me :). ChuanqiXu: I prefer to remain the verification personally since it had reported many errors for me :).
report_fatal_error("Broken function");

legacy::FunctionPassManager FPM(F.getParent());

FPM.add(createSCCPPass());
FPM.add(createCFGSimplificationPass());
FPM.add(createEarlyCSEPass());
FPM.add(createCFGSimplificationPass());

FPM.doInitialization();
FPM.run(F);
FPM.doFinalization();
}

// Assuming we arrived at the block NewBlock from Prev instruction, store		// Assuming we arrived at the block NewBlock from Prev instruction, store
// PHI's incoming values in the ResolvedValues map.		// PHI's incoming values in the ResolvedValues map.
static void		static void
scanPHIsAndUpdateValueMap(Instruction Prev, BasicBlock NewBlock,		scanPHIsAndUpdateValueMap(Instruction Prev, BasicBlock NewBlock,
DenseMap<Value , Value > &ResolvedValues) {		DenseMap<Value , Value > &ResolvedValues) {
auto *PrevBB = Prev->getParent();		auto *PrevBB = Prev->getParent();
for (PHINode &PN : NewBlock->phis()) {		for (PHINode &PN : NewBlock->phis()) {
auto V = PN.getIncomingValueForBlock(PrevBB);		auto V = PN.getIncomingValueForBlock(PrevBB);
▲ Show 20 Lines • Show All 326 Lines • ▼ Show 20 Lines	static void splitSwitchCoroutine(Function &F, coro::Shape &Shape,
createResumeEntryBlock(F, Shape);		createResumeEntryBlock(F, Shape);
auto ResumeClone = createClone(F, ".resume", Shape,		auto ResumeClone = createClone(F, ".resume", Shape,
CoroCloner::Kind::SwitchResume);		CoroCloner::Kind::SwitchResume);
auto DestroyClone = createClone(F, ".destroy", Shape,		auto DestroyClone = createClone(F, ".destroy", Shape,
CoroCloner::Kind::SwitchUnwind);		CoroCloner::Kind::SwitchUnwind);
auto CleanupClone = createClone(F, ".cleanup", Shape,		auto CleanupClone = createClone(F, ".cleanup", Shape,
CoroCloner::Kind::SwitchCleanup);		CoroCloner::Kind::SwitchCleanup);

postSplitCleanup(*ResumeClone);		removeUnreachableBlocks(*ResumeClone);
postSplitCleanup(*DestroyClone);		removeUnreachableBlocks(*DestroyClone);
postSplitCleanup(*CleanupClone);		removeUnreachableBlocks(*CleanupClone);

addMustTailToCoroResumes(*ResumeClone);		addMustTailToCoroResumes(*ResumeClone);

// Store addresses resume/destroy/cleanup functions in the coroutine frame.		// Store addresses resume/destroy/cleanup functions in the coroutine frame.
updateCoroFrame(Shape, ResumeClone, DestroyClone, CleanupClone);		updateCoroFrame(Shape, ResumeClone, DestroyClone, CleanupClone);

assert(Clones.empty());		assert(Clones.empty());
Clones.push_back(ResumeClone);		Clones.push_back(ResumeClone);
[... 323 lines not shown ...]
static void		static void
updateCallGraphAfterCoroutineSplit(Function &F, const coro::Shape &Shape,		updateCallGraphAfterCoroutineSplit(Function &F, const coro::Shape &Shape,
const SmallVectorImpl<Function *> &Clones,		const SmallVectorImpl<Function *> &Clones,
CallGraph &CG, CallGraphSCC &SCC) {		CallGraph &CG, CallGraphSCC &SCC) {
if (!Shape.CoroBegin)		if (!Shape.CoroBegin)
return;		return;

removeCoroEnds(Shape, &CG);		removeCoroEnds(Shape, &CG);
postSplitCleanup(F);		removeUnreachableBlocks(F);
		aeubanks: This could regress coroutines under the legacy PM, since we no longer run `postSplitCleanup()` here. But maybe we don't care about the legacy PM so much anymore.
		lxfind (author): Not really. CoroSplit for the legacy PM works in a similar way: function simplifications happen after CoroSplit there as well (unless at O0).

// Update call graph and add the functions we created to the SCC.		// Update call graph and add the functions we created to the SCC.
coro::updateCallGraph(F, Clones, CG, SCC);		coro::updateCallGraph(F, Clones, CG, SCC);
}		}

static void updateCallGraphAfterCoroutineSplit(		static void updateCallGraphAfterCoroutineSplit(
LazyCallGraph::Node &N, const coro::Shape &Shape,		LazyCallGraph::Node &N, const coro::Shape &Shape,
const SmallVectorImpl<Function *> &Clones, LazyCallGraph::SCC &C,		const SmallVectorImpl<Function *> &Clones, LazyCallGraph::SCC &C,
LazyCallGraph &CG, CGSCCAnalysisManager &AM, CGSCCUpdateResult &UR,		LazyCallGraph &CG, CGSCCAnalysisManager &AM, CGSCCUpdateResult &UR,
FunctionAnalysisManager &FAM) {		FunctionAnalysisManager &FAM) {
if (!Shape.CoroBegin)		if (!Shape.CoroBegin)
return;		return;

for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) {		for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) {
auto &Context = End->getContext();		auto &Context = End->getContext();
End->replaceAllUsesWith(ConstantInt::getFalse(Context));		End->replaceAllUsesWith(ConstantInt::getFalse(Context));
End->eraseFromParent();		End->eraseFromParent();
}		}

		removeUnreachableBlocks(N.getFunction());

if (!Clones.empty()) {		if (!Clones.empty()) {
switch (Shape.ABI) {		switch (Shape.ABI) {
case coro::ABI::Switch:		case coro::ABI::Switch:
// Each clone in the Switch lowering is independent of the other clones.		// Each clone in the Switch lowering is independent of the other clones.
// Let the LazyCallGraph know about each one separately.		// Let the LazyCallGraph know about each one separately.
for (Function *Clone : Clones)		for (Function *Clone : Clones)
CG.addSplitFunction(N.getFunction(), *Clone);		CG.addSplitFunction(N.getFunction(), *Clone);
break;		break;
case coro::ABI::Async:		case coro::ABI::Async:
case coro::ABI::Retcon:		case coro::ABI::Retcon:
case coro::ABI::RetconOnce:		case coro::ABI::RetconOnce:
// Each clone in the Async/Retcon lowering references of the other clones.		// Each clone in the Async/Retcon lowering references of the other clones.
// Let the LazyCallGraph know about all of them at once.		// Let the LazyCallGraph know about all of them at once.
CG.addSplitRefRecursiveFunctions(N.getFunction(), Clones);		CG.addSplitRefRecursiveFunctions(N.getFunction(), Clones);
break;		break;
}		}

// Let the CGSCC infra handle the changes to the original function.		// Let the CGSCC infra handle the changes to the original function.
updateCGAndAnalysisManagerForCGSCCPass(CG, C, N, AM, UR, FAM);		updateCGAndAnalysisManagerForCGSCCPass(CG, C, N, AM, UR, FAM);
}		}

// Do some cleanup and let the CGSCC infra see if we've cleaned up any edges
// to the split functions.
postSplitCleanup(N.getFunction());
updateCGAndAnalysisManagerForFunctionPass(CG, C, N, AM, UR, FAM);
}		}

// When we see the coroutine the first time, we insert an indirect call to a		// When we see the coroutine the first time, we insert an indirect call to a
// devirt trigger function and mark the coroutine that it is now ready for		// devirt trigger function and mark the coroutine that it is now ready for
// split.		// split.
// Async lowering uses this after it has split the function to restart the		// Async lowering uses this after it has split the function to restart the
// pipeline.		// pipeline.
static void prepareForSplit(Function &F, CallGraph &CG,		static void prepareForSplit(Function &F, CallGraph &CG,
[... 224 lines not shown ...]	for (auto *PrepareFn : PrepareFns) {
replaceAllPrepares(PrepareFn, CG, C);		replaceAllPrepares(PrepareFn, CG, C);
}		}
}		}

// Split all the coroutines.		// Split all the coroutines.
for (LazyCallGraph::Node *N : Coroutines) {		for (LazyCallGraph::Node *N : Coroutines) {
Function &F = N->getFunction();		Function &F = N->getFunction();
Attribute Attr = F.getFnAttribute(CORO_PRESPLIT_ATTR);		Attribute Attr = F.getFnAttribute(CORO_PRESPLIT_ATTR);
StringRef Value = Attr.getValueAsString();		StringRef Value = Attr.getValueAsString();
LLVM_DEBUG(dbgs() << "CoroSplit: Processing coroutine '" << F.getName()		LLVM_DEBUG(dbgs() << "CoroSplit: Processing coroutine '" << F.getName()
<< "' state: " << Value << "\n");		<< "' state: " << Value << "\n");
if (Value == UNPREPARED_FOR_SPLIT) {
// Enqueue a second iteration of the CGSCC pipeline on this SCC.
UR.CWorklist.insert(&C);
F.addFnAttr(CORO_PRESPLIT_ATTR, PREPARED_FOR_SPLIT);
continue;
}
F.removeFnAttr(CORO_PRESPLIT_ATTR);		F.removeFnAttr(CORO_PRESPLIT_ATTR);
		ChuanqiXu: These are not needed.
		ChuanqiXu: Refactor this into `LLVM_DEBUG(dbgs() << "CoroSplit: Processing coroutine '" << F.getName() << "' state: " << Attr.getValueAsString() << "\n");`, which would eliminate a warning in release builds.
		lxfind (author): Good catch. How did you catch this? I don't see the warning on my Mac by default.
		ychen: Drive-by nit: would it be better to move it up near `Coroutines.push_back(&N);`?
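The release-build warning discussed in this thread comes from a local variable that is only read inside `LLVM_DEBUG`: when the macro is compiled out, the variable becomes unused. A minimal sketch of the pattern follows, assuming a hypothetical `DEBUG_LOG` macro and a hypothetical `lookupState` helper as stand-ins (neither is part of LLVM or of this patch):

```cpp
#include <iostream>

// Hypothetical stand-in for LLVM_DEBUG: the argument is only compiled
// when ENABLE_DEBUG_LOG is defined, otherwise the macro expands to nothing.
#ifdef ENABLE_DEBUG_LOG
#define DEBUG_LOG(X) do { X; } while (false)
#else
#define DEBUG_LOG(X) do { } while (false)
#endif

// Hypothetical helper standing in for reading the pre-split attribute value.
const char *lookupState(int CoroutineId) {
  return CoroutineId == 0 ? "0" : "1";
}

int processCoroutine(int CoroutineId) {
  // Pattern that can trigger an unused-variable warning when DEBUG_LOG
  // is compiled out, because `Value` is then never read:
  //   const char *Value = lookupState(CoroutineId);
  //   DEBUG_LOG(std::cerr << "state: " << Value << "\n");
  //
  // Suggested pattern: evaluate the expression inside the macro, so no
  // local variable is left behind in builds without debug logging.
  DEBUG_LOG(std::cerr << "state: " << lookupState(CoroutineId) << "\n");
  return CoroutineId + 1;
}
```

Folding the expression into the macro argument keeps the debug output identical while leaving nothing for the compiler to flag in builds where the macro is empty.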

SmallVector<Function *, 4> Clones;		SmallVector<Function *, 4> Clones;
const coro::Shape Shape = splitCoroutine(F, Clones, ReuseFrameSlot);		const coro::Shape Shape = splitCoroutine(F, Clones, ReuseFrameSlot);
updateCallGraphAfterCoroutineSplit(*N, Shape, Clones, C, CG, AM, UR, FAM);		updateCallGraphAfterCoroutineSplit(*N, Shape, Clones, C, CG, AM, UR, FAM);

if ((Shape.ABI == coro::ABI::Async || Shape.ABI == coro::ABI::Retcon ||		if (!Shape.CoroSuspends.empty()) {
		ChuanqiXu: I am not familiar with any Shape.ABI other than coro::ABI::Switch, but this diff line looks strange: the condition appears to get weaker.
		aeubanks: I believe that's intentional, and a big part of this patch. We want to re-add the current SCC (and the split SCCs) any time we split an SCC. Before, we weren't doing that properly.
		ychen: I got your point. So the comment "// All clones will be in the same RefSCC ...." is not accurate, I think?
		lxfind (author): Note that previously this was done only for the Async, Retcon and RetconOnce ABIs, not for the Switch ABI. I guess the comment is accurate for those ABIs, but it is not true for the Switch ABI. Also, before this patch we were not adding the split functions back to the pipeline to be properly optimized. Now we are doing that, which should help improve the performance of the post-split functions.
		ychen: I missed the ABI condition. Thanks for the explanation.
Shape.ABI == coro::ABI::RetconOnce) &&		// Run the CGSCC pipeline on the original and newly split functions.
!Shape.CoroSuspends.empty()) {		UR.CWorklist.insert(&C);
// Run the CGSCC pipeline on the newly split functions.		for (Function *Clone : Clones)
		aeubanks: Are all `Clones` guaranteed to be in different SCCs?
		lxfind (author): No. But CWorklist is a map, so redundant insertions won't happen.
		aeubanks: Oh, I didn't know that, thanks!
// All clones will be in the same RefSCC, so choose a random clone.		UR.CWorklist.insert(CG.lookupSCC(CG.get(*Clone)));
UR.RCWorklist.insert(CG.lookupRefSCC(CG.get(*Clones[0])));
}		}
}		}
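The inline discussion above relies on the worklist ignoring redundant insertions, so that queuing the SCC of every clone is safe even when several clones share one SCC. A minimal sketch of that idea follows. `Scc`, `DedupWorklist` and `requeueAfterSplit` are hypothetical names for illustration, not LLVM's types; LLVM's own worklist may treat re-insertion differently (for example by reprioritizing), and this sketch only models the de-duplication.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SCC; only pointer identity matters here.
struct Scc {
  std::string Name;
};

// Simplified worklist that silently ignores duplicate insertions.
// This is an illustration, not LLVM's implementation.
class DedupWorklist {
public:
  // Returns true if S was newly queued, false if it was already pending.
  bool insert(Scc *S) {
    if (!Pending.insert(S).second)
      return false;
    Order.push_back(S);
    return true;
  }

  std::size_t size() const { return Order.size(); }

  // Removes and returns the most recently queued SCC.
  Scc *pop() {
    assert(!Order.empty() && "pop from empty worklist");
    Scc *S = Order.back();
    Order.pop_back();
    Pending.erase(S);
    return S;
  }

private:
  std::vector<Scc *> Order;
  std::unordered_set<Scc *> Pending;
};

// Mirrors the shape of the loop in the patch: queue the current SCC, then
// the SCC of each clone. Clones sharing an SCC are queued only once.
std::size_t requeueAfterSplit(DedupWorklist &WL, Scc *Current,
                              const std::vector<Scc *> &CloneSccs) {
  WL.insert(Current);
  for (Scc *S : CloneSccs)
    WL.insert(S);
  return WL.size();
}
```

With this shape the caller does not need to check whether two clones landed in the same SCC before queuing them, which matches the reasoning given in the thread.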

if (!PrepareFns.empty()) {		if (!PrepareFns.empty()) {
for (auto *PrepareFn : PrepareFns) {		for (auto *PrepareFn : PrepareFns) {
replaceAllPrepares(PrepareFn, CG, C);		replaceAllPrepares(PrepareFn, CG, C);
}		}
}		}
[... 122 lines not shown ...]

llvm/test/Transforms/Coroutines/coro-alloc-with-param-O0.ll

[... 28 lines not shown ...]	suspend:
ret i8* %hdl		ret i8* %hdl
}		}

; See if %this was added to the frame		; See if %this was added to the frame
; CHECK: %f_copy.Frame = type { void (%f_copy.Frame*)*, void (%f_copy.Frame*)*, i64, i1 }		; CHECK: %f_copy.Frame = type { void (%f_copy.Frame*)*, void (%f_copy.Frame*)*, i64, i1 }

; See that %this is spilled into the frame		; See that %this is spilled into the frame
; CHECK-LABEL: define i8* @f_copy(i64 %this_arg)		; CHECK-LABEL: define i8* @f_copy(i64 %this_arg)
		; CHECK: %this.addr = alloca i64, align 8
		; CHECK: store i64 %this_arg, i64* %this.addr, align 4
		; CHECK: %this = load i64, i64* %this.addr, align 4
; CHECK: %this.spill.addr = getelementptr inbounds %f_copy.Frame, %f_copy.Frame* %FramePtr, i32 0, i32 2		; CHECK: %this.spill.addr = getelementptr inbounds %f_copy.Frame, %f_copy.Frame* %FramePtr, i32 0, i32 2
; CHECK: store i64 %this_arg, i64* %this.spill.addr		; CHECK: store i64 %this, i64* %this.spill.addr
; CHECK: ret i8* %hdl		; CHECK: ret i8* %hdl

; See that %this was loaded from the frame		; See that %this was loaded from the frame
; CHECK-LABEL: @f_copy.resume(		; CHECK-LABEL: @f_copy.resume(
; CHECK: %this.reload = load i64, i64* %this.reload.addr		; CHECK: %this.reload = load i64, i64* %this.reload.addr
; CHECK: call void @print2(i64 %this.reload)		; CHECK: call void @print2(i64 %this.reload)
; CHECK: ret void		; CHECK: ret void

declare i8* @llvm.coro.free(token, i8*)		declare i8* @llvm.coro.free(token, i8*)
declare i32 @llvm.coro.size.i32()		declare i32 @llvm.coro.size.i32()
declare i8 @llvm.coro.suspend(token, i1)		declare i8 @llvm.coro.suspend(token, i1)
declare void @llvm.coro.resume(i8*)		declare void @llvm.coro.resume(i8*)
declare void @llvm.coro.destroy(i8*)		declare void @llvm.coro.destroy(i8*)

declare token @llvm.coro.id(i32, i8*, i8*, i8*)		declare token @llvm.coro.id(i32, i8*, i8*, i8*)
declare i1 @llvm.coro.alloc(token)		declare i1 @llvm.coro.alloc(token)
declare i8* @llvm.coro.begin(token, i8*)		declare i8* @llvm.coro.begin(token, i8*)
declare i1 @llvm.coro.end(i8*, i1)		declare i1 @llvm.coro.end(i8*, i1)

declare noalias i8* @myAlloc(i64, i32)		declare noalias i8* @myAlloc(i64, i32)
declare double @print(double)		declare double @print(double)
declare void @print2(i64)		declare void @print2(i64)
declare void @free(i8*)		declare void @free(i8*)

llvm/test/Transforms/Coroutines/coro-alloca-01.ll

	[... 40 lines not shown ...]
	}			}

	; both %x and %y, as well as %alias_phi would all go to the frame.			; both %x and %y, as well as %alias_phi would all go to the frame.
	; CHECK: %f.Frame = type { void (%f.Frame*)*, void (%f.Frame*)*, i64, i64, i32*, i1 }			; CHECK: %f.Frame = type { void (%f.Frame*)*, void (%f.Frame*)*, i64, i64, i32*, i1 }
	; CHECK-LABEL: @f(			; CHECK-LABEL: @f(
	; CHECK: %x.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2			; CHECK: %x.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2
	; CHECK: %y.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 3			; CHECK: %y.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 3
	; CHECK: %x.alias = bitcast i64* %x.reload.addr to i32*			; CHECK: %x.alias = bitcast i64* %x.reload.addr to i32*
				; CHECK: %x.alias.merge = phi i32* [ %x.alias, %flag_true ]
	; CHECK: %y.alias = bitcast i64* %y.reload.addr to i32*			; CHECK: %y.alias = bitcast i64* %y.reload.addr to i32*
	; CHECK: %alias_phi = select i1 %n, i32* %x.alias, i32* %y.alias			; CHECK: %y.alias.merge = phi i32* [ %y.alias, %flag_false ]
				; CHECK: %alias_phi = phi i32* [ %x.alias.merge, %merge.from.flag_true ], [ %y.alias.merge, %merge.from.flag_false ]
	; CHECK: %alias_phi.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4			; CHECK: %alias_phi.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4
	; CHECK: store i32* %alias_phi, i32** %alias_phi.spill.addr, align 8			; CHECK: store i32* %alias_phi, i32** %alias_phi.spill.addr, align 8

	declare i8* @llvm.coro.free(token, i8*)			declare i8* @llvm.coro.free(token, i8*)
	declare i32 @llvm.coro.size.i32()			declare i32 @llvm.coro.size.i32()
	declare i8 @llvm.coro.suspend(token, i1)			declare i8 @llvm.coro.suspend(token, i1)
	declare void @llvm.coro.resume(i8*)			declare void @llvm.coro.resume(i8*)
	declare void @llvm.coro.destroy(i8*)			declare void @llvm.coro.destroy(i8*)
	[... 9 lines not shown ...]

llvm/test/Transforms/Coroutines/coro-alloca-04.ll

	[... 39 lines not shown ...]
	}			}

	; both %x and %alias_phi would go to the frame.			; both %x and %alias_phi would go to the frame.
	; CHECK: %f.Frame = type { void (%f.Frame*)*, void (%f.Frame*)*, i64, i32*, i1 }			; CHECK: %f.Frame = type { void (%f.Frame*)*, void (%f.Frame*)*, i64, i32*, i1 }
	; CHECK-LABEL: @f(			; CHECK-LABEL: @f(
	; CHECK: store void (%f.Frame*)* @f.destroy, void (%f.Frame*)** %destroy.addr			; CHECK: store void (%f.Frame*)* @f.destroy, void (%f.Frame*)** %destroy.addr
	; CHECK-NEXT: %0 = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2			; CHECK-NEXT: %0 = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2
	; CHECK-NEXT: %1 = bitcast i64* %0 to i8*			; CHECK-NEXT: %1 = bitcast i64* %0 to i8*
	; CHECK-NEXT: %2 = bitcast i8* %1 to i32*			; CHECK-NEXT: %2 = getelementptr i8, i8* %1, i64 0
	; CHECK-NEXT: %alias_phi.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 3			; CHECK-NEXT: %3 = bitcast i8* %2 to i32*
	; CHECK-NEXT: store i32* %2, i32** %alias_phi.spill.addr			; CHECK: %alias_phi.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 3
				; CHECK-NEXT: store i32* %3, i32** %alias_phi.spill.addr

	declare i8* @llvm.coro.free(token, i8*)			declare i8* @llvm.coro.free(token, i8*)
	declare i32 @llvm.coro.size.i32()			declare i32 @llvm.coro.size.i32()
	declare i8 @llvm.coro.suspend(token, i1)			declare i8 @llvm.coro.suspend(token, i1)
	declare void @llvm.coro.resume(i8*)			declare void @llvm.coro.resume(i8*)
	declare void @llvm.coro.destroy(i8*)			declare void @llvm.coro.destroy(i8*)

	declare token @llvm.coro.id(i32, i8*, i8*, i8*)			declare token @llvm.coro.id(i32, i8*, i8*, i8*)
	declare i1 @llvm.coro.alloc(token)			declare i1 @llvm.coro.alloc(token)
	declare i8* @llvm.coro.begin(token, i8*)			declare i8* @llvm.coro.begin(token, i8*)
	declare i1 @llvm.coro.end(i8*, i1)			declare i1 @llvm.coro.end(i8*, i1)

	declare void @print(i32*)			declare void @print(i32*)
	declare noalias i8* @malloc(i32)			declare noalias i8* @malloc(i32)
	declare void @free(i8*)			declare void @free(i8*)

llvm/test/Transforms/Coroutines/coro-alloca-05.ll

[... 26 lines not shown ...]	suspend:
call i1 @llvm.coro.end(i8* %hdl, i1 0)		call i1 @llvm.coro.end(i8* %hdl, i1 0)
ret i8* %hdl		ret i8* %hdl
}		}

; CHECK-LABEL: @f.resume(		; CHECK-LABEL: @f.resume(
; CHECK-NEXT: entry.resume:		; CHECK-NEXT: entry.resume:
; CHECK-NEXT: [[VFRAME:%.*]] = bitcast %f.Frame* [[FRAMEPTR:%.*]] to i8*		; CHECK-NEXT: [[VFRAME:%.*]] = bitcast %f.Frame* [[FRAMEPTR:%.*]] to i8*
; CHECK-NEXT: [[X:%.*]] = alloca i32, align 4		; CHECK-NEXT: [[X:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[X_VALUE:%.*]] = load i32, i32* [[X]], align 4		; CHECK: [[X_VALUE:%.*]] = load i32, i32* [[X]], align 4
; CHECK-NEXT: call void @print(i32 [[X_VALUE]])		; CHECK-NEXT: call void @print(i32 [[X_VALUE]])
; CHECK-NEXT: call void @free(i8* [[VFRAME]])		; CHECK: call void @free(i8* [[VFRAME]])
; CHECK-NEXT: ret void		; CHECK: ret void

declare i8* @llvm.coro.free(token, i8*)		declare i8* @llvm.coro.free(token, i8*)
declare i32 @llvm.coro.size.i32()		declare i32 @llvm.coro.size.i32()
declare i8 @llvm.coro.suspend(token, i1)		declare i8 @llvm.coro.suspend(token, i1)
declare void @llvm.coro.resume(i8*)		declare void @llvm.coro.resume(i8*)
declare void @llvm.coro.destroy(i8*)		declare void @llvm.coro.destroy(i8*)

declare token @llvm.coro.id(i32, i8*, i8*, i8*)		declare token @llvm.coro.id(i32, i8*, i8*, i8*)
declare i1 @llvm.coro.alloc(token)		declare i1 @llvm.coro.alloc(token)
declare i8* @llvm.coro.begin(token, i8*)		declare i8* @llvm.coro.begin(token, i8*)
declare i1 @llvm.coro.end(i8*, i1)		declare i1 @llvm.coro.end(i8*, i1)

declare void @print(i32)		declare void @print(i32)
declare noalias i8* @malloc(i32)		declare noalias i8* @malloc(i32)
declare void @free(i8*)		declare void @free(i8*)

llvm/test/Transforms/Coroutines/restart-trigger.ll

	; Verifies that the restart trigger that is used by legacy coroutine passes			; Verifies that the restart trigger that is used by legacy coroutine passes
	; forces the legacy pass manager to restart IPO pipelines, thereby causing the			; forces the legacy pass manager to restart IPO pipelines, thereby causing the
	; same coroutine to be looked at by CoroSplit pass twice.			; same coroutine to be looked at by CoroSplit pass twice.
	; REQUIRES: asserts			; REQUIRES: asserts
	; RUN: opt < %s -S -O0 -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck %s			; RUN: opt < %s -S -O0 -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck %s
	; RUN: opt < %s -S -O1 -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck %s			; RUN: opt < %s -S -O1 -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck %s
	; The following tests use the new pass manager, and verify that the coroutine			; The following tests use the new pass manager, and verify that the coroutine
	; passes re-run the CGSCC pipeline.			; passes re-run the CGSCC pipeline.
	; RUN: opt < %s -S -passes='default<O0>' -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck %s			; RUN: opt < %s -S -passes='default<O0>' -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck --check-prefix=CHECK-NEWPM %s
	; RUN: opt < %s -S -passes='default<O1>' -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck %s			; RUN: opt < %s -S -passes='default<O1>' -enable-coroutines -debug-only=coro-split 2>&1 | FileCheck --check-prefix=CHECK-NEWPM %s

	; CHECK: CoroSplit: Processing coroutine 'f' state: 0			; CHECK: CoroSplit: Processing coroutine 'f' state: 0
	; CHECK-NEXT: CoroSplit: Processing coroutine 'f' state: 1			; CHECK-NEXT: CoroSplit: Processing coroutine 'f' state: 1
				; CHECK-NEWPM: CoroSplit: Processing coroutine 'f' state: 0
				; CHECK-NEWPM-NOT: CoroSplit: Processing coroutine 'f' state: 1


	define void @f() {			define void @f() {
	%id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)			%id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
	%size = call i32 @llvm.coro.size.i32()			%size = call i32 @llvm.coro.size.i32()
	%alloc = call i8* @malloc(i32 %size)			%alloc = call i8* @malloc(i32 %size)
	%hdl = call i8* @llvm.coro.begin(token %id, i8* %alloc)			%hdl = call i8* @llvm.coro.begin(token %id, i8* %alloc)
	call void @print(i32 0)			call void @print(i32 0)
	%s1 = call i8 @llvm.coro.suspend(token none, i1 false)			%s1 = call i8 @llvm.coro.suspend(token none, i1 false)
	[... 27 lines not shown ...]