This is an archive of the discontinued LLVM Phabricator instance.

[Pipelines] Enable EarlyCSE after CoroCleanup to avoid runtime performance losses (5/5)
AbandonedPublic

Authored by ChuanqiXu on Apr 25 2022, 12:43 AM.

Details

Summary

After the previous patch landed, the compiler wouldn't optimize readnone function if we enabled coroutine. This is not good . This patch tries to fix the problem by enabling EarlyCSE pass after CoroSplit. I think this is the price we couldn't avoid.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Apr 25 2022, 12:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 12:43 AM
ChuanqiXu requested review of this revision.Apr 25 2022, 12:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 12:43 AM

would it make sense to put all of the coroutine lower passes right at the beginning of the pipeline? e.g. around LowerExpectIntrinsicPass? is there a reason CoroSplit is interleaved in the CGSCC pass manager?

I'd like to understand the constraints better before going forward with a solution that currently seems unprincipled to me

would it make sense to put all of the coroutine lower passes right at the beginning of the pipeline? e.g. around LowerExpectIntrinsicPass? is there a reason CoroSplit is interleaved in the CGSCC pass manager?

I'd like to understand the constraints better before going forward with a solution that currently seems unprincipled to me

there's also been some talk about building infrastructure to conditionally running passes if some other pass has done some transformation, but ideally we'd have a simpler solution here

would it make sense to put all of the coroutine lower passes right at the beginning of the pipeline? e.g. around LowerExpectIntrinsicPass? is there a reason CoroSplit is interleaved in the CGSCC pass manager?

I'd like to understand the constraints better before going forward with a solution that currently seems unprincipled to me

We couldn't forward CoroSplit pass. We would lose many optimization opportunities in that way. Here is the background:

(1) CoroSplit pass would split a coroutine into multiple functions.
(2) The LLVM compiler is much better at optimizing a function than interprocedural optimizations (IPO). In fact, the most effective IPO in LLVM now is inlining, which would enable function optimization after inlining another function.
(3) After CoroSplit, it is rare if we could get the original function by inlining.
(4) As a result, we could find that CoroSplit would break function level optimization opportunities and we couldn't save it by IPO.
(5) So we would try to run as many optimization passes before CoroSplit as possible.

So we couldn't run CoroSplit pass at the very beginning.

is there a reason CoroSplit is interleaved in the CGSCC pass manager?

From the statement above, we could know that it is better to run CoroSplit after inlining to inline calls in a coroutine to get better optimization chances. So the question is converted to "Why CoroSplit pass is in CGSCC pass instead of behind it". The reason here is that we couldn't inline a unlowered coroutine in another unlowered coroutine otherwise CoroSplit coulnd't handle it (due to the design of coroutine). Due to the potential inlining, the unlowered coroutines are not allowed to be inlined. But it is no harmful to inline a lowered coroutine. And it is a necessary to enable an optimization pass called CoroElide to inline a lowered coroutine into a unlowered coroutine. So here are the reasons we put CoroSplit in CGSCC passes:
(1) We need to run inlining before CoroSplit.
(2) We need to run inlining after CoroSplit too.

Hope I state things clear enough : )

ChuanqiXu planned changes to this revision.Apr 28 2022, 2:11 AM

Plan changes due to the previous patch changes its plan too