This is an archive of the discontinued LLVM Phabricator instance.

[Coroutines][6/6] Clang schedules new passes
ClosedPublic

Authored by modocache on Dec 26 2019, 6:28 AM.

Details

Summary

Depends on https://reviews.llvm.org/D71902.

The last in a series of six patches that ports the LLVM coroutines
passes to the new pass manager infrastructure.

This patch has Clang schedule the new coroutines passes when the
-fexperimental-new-pass-manager option is used. With this and the
previous 5 patches, Clang is capable of building and successfully
running the test suite of large coroutines projects such as
https://github.com/lewissbaker/cppcoro with
ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On.

Event Timeline

modocache created this revision.Dec 26 2019, 6:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 26 2019, 6:28 AM
junparser added inline comments.Dec 26 2019, 9:38 PM
clang/lib/CodeGen/BackendUtil.cpp
1240

Since coro elision depends on other optimization pass(inline and so on) implicitly, how can we adjust the pipeline to achieve this.

modocache marked an inline comment as done.Dec 27 2019, 9:44 AM
modocache added inline comments.
clang/lib/CodeGen/BackendUtil.cpp
1240

One option would be to use the new pass manager's registration callbacks, like PassBuilder::registerPipelineStartEPCallback or PassBuilder::registerOptimizerLastEPCallback. These work similarly to the old pass manager's PassManagerBuilder::addExtension. That's something that I think would be good to improve in a follow-up patch, but let me know if you'd rather see it in this one.

There is another issue we should consider: clang is crashed when compile coroutine with -disable-llvm-passes and output an object file.
Is it reasonable to run coroutine passes even -disable-llvm-passes is enabled?

clang/lib/CodeGen/BackendUtil.cpp
1240

yes, please. It should be done in this patch sets.

wenlei added a subscriber: wenlei.Dec 30 2019, 12:37 PM
wenlei added inline comments.
clang/lib/CodeGen/BackendUtil.cpp
1241

Manually scheduling the 2nd coro-split passes in the same CGSCC pipeline would make the resume/suspend funclet ineligible for the bulk of CSGSS opts, given the split point is relatively early. The implication would be discouraging the use of coroutine in performance critical path. I would love to see this being addressed before we claim coroutine is ready for new PM.

As commented in the 2nd patch, we may not need the devirt trick used with legacy PM, instead for new PM, we could try to leverage CGSCCUpdateResult without involving artificial indirect call and devirt (or follow how outlining is handled by new PM).

modocache planned changes to this revision.Jan 1 2020, 7:25 PM
modocache marked 2 inline comments as done and an inline comment as not done.
modocache added inline comments.
clang/lib/CodeGen/BackendUtil.cpp
1240

Will do!

1241

I would love to see this being addressed before we claim coroutine is ready for new PM.

I apologize, since I didn't intend to make such a claim. In http://lists.llvm.org/pipermail/llvm-dev/2019-December/137835.html I explained that these 6 patches were focused on lowering coroutine intrinsics. My goal was to resolve https://bugs.llvm.org/show_bug.cgi?id=42867, so that C++ coroutines code didn't trigger a fatal error in ISel.

That being said, I'm happy to make changes here. I'll send updates for D71899 and this patch.

I'm currently working on ensuring that CGSCC optimizations are rerun to optimize coroutine funclets -- the primary feedback I received on this and on D71899 -- but I just realized I didn't respond to one comment on this set of reviews, from @junparser:

There is another issue we should consider: clang is crashed when compile coroutine with -disable-llvm-passes and output an object file.

It's always been the case, since the coroutine intrinsics and passes were first added to LLVM, that attempting to codegen without first running coroutine passes would cause a crash during instruction selection. So clang -Xclang -disable-llvm-passes -c has always crashed Clang during LLVM ISel, as it does in this example that uses Clang 9.0.0 and the legacy pass manager: https://godbolt.org/z/Mj2R5G

Personally I'm of the opinion that this is less than ideal... I may be wrong, but I don't think there are very many other C++ features that *require* Clang to run LLVM passes (perhaps the always_inline attribute requires LLVM passes to be run for correctness? I'm not sure). So I would like to see this eventually addressed somehow.

Is it reasonable to run coroutine passes even -disable-llvm-passes is enabled?

My personal opinion is that this would not be reasonable. The option -disable-llvm-passes should, from my point of view, prevent any and all LLVM passes from being run. I also frequently make use of this option when debugging the LLVM IR being output for C++ coroutines code, so if -disable-llvm-passes didn't disable coroutines passes, I'd need another option that did (-disable-llvm-passes-no-really-even-coroutine-passes-them-too 😅).

All this being said, considering this behavior has existed in the legacy PM since day one, I think we should start a separate discussion on if/how to change that behavior. I'm working on an update for these patches to address funclet optimization, but the update will not change the fact that coroutine passes are not run when -disable-llvm-passes is specified. I think that's an orthogonal issue.

I'm currently working on ensuring that CGSCC optimizations are rerun to optimize coroutine funclets -- the primary feedback I received on this and on D71899 -- but I just realized I didn't respond to one comment on this set of reviews, from @junparser:

There is another issue we should consider: clang is crashed when compile coroutine with -disable-llvm-passes and output an object file.

It's always been the case, since the coroutine intrinsics and passes were first added to LLVM, that attempting to codegen without first running coroutine passes would cause a crash during instruction selection. So clang -Xclang -disable-llvm-passes -c has always crashed Clang during LLVM ISel, as it does in this example that uses Clang 9.0.0 and the legacy pass manager: https://godbolt.org/z/Mj2R5G

Personally I'm of the opinion that this is less than ideal... I may be wrong, but I don't think there are very many other C++ features that *require* Clang to run LLVM passes (perhaps the always_inline attribute requires LLVM passes to be run for correctness? I'm not sure). So I would like to see this eventually addressed somehow.

Is it reasonable to run coroutine passes even -disable-llvm-passes is enabled?

My personal opinion is that this would not be reasonable. The option -disable-llvm-passes should, from my point of view, prevent any and all LLVM passes from being run. I also frequently make use of this option when debugging the LLVM IR being output for C++ coroutines code, so if -disable-llvm-passes didn't disable coroutines passes, I'd need another option that did (-disable-llvm-passes-no-really-even-coroutine-passes-them-too 😅).

All this being said, considering this behavior has existed in the legacy PM since day one, I think we should start a separate discussion on if/how to change that behavior. I'm working on an update for these patches to address funclet optimization, but the update will not change the fact that coroutine passes are not run when -disable-llvm-passes is specified. I think that's an orthogonal issue.

make sense to me, thanks!

modocache updated this revision to Diff 236245.Jan 5 2020, 5:49 AM
modocache removed a subscriber: wenlei.

Thanks for the reviews. Based on my latest revision of D71899, the coro-split diff, this patch now no longer manually enqueues coro-split twice. In addition, it uses the PassBuilder registration callbacks to enqueue coroutine passes in appropriate spots in the pipeline. coro-split now restarts the entire MainCGPipeline that's built as part of module optimization. As discussed, the coroutine passes are enqueued at -O0, and they're not enqueued when -disable-llvm-passes is specified.

Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2020, 5:49 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
modocache updated this revision to Diff 236889.Jan 8 2020, 12:41 PM

Update tests -- we now re-run the SCC pass, but don't insert the coroutine funclets into the SCC, so we no longer see the funclets in the output being tested here.

modocache updated this revision to Diff 236967.Jan 8 2020, 10:24 PM

Initialize PipelineTuningOptions properly.

modocache updated this revision to Diff 244566.Feb 13 2020, 8:22 PM

Clean up coroutine intrinsics as part of the ThinLTO pre-link pipeline.

modocache updated this revision to Diff 245054.Feb 17 2020, 4:02 PM
modocache removed a reviewer: wenlei.
modocache removed a project: Restricted Project.
modocache removed subscribers: wenlei, hiraditya.

Rebase on top of the latest version of D71902.

modocache added subscribers: wenlei, hiraditya.
wenlei accepted this revision.Feb 17 2020, 8:40 PM
This revision is now accepted and ready to land.Feb 17 2020, 8:40 PM
This revision was automatically updated to reflect the committed changes.