The revision tries to fix https://github.com/llvm/llvm-project/issues/47177. The key reason for this bug is similar to https://github.com/llvm/llvm-project/issues/47179. It is about the introduction of coroutine breaks the assumption that a function could only be executed in one thread. Also the fix is similar too. I add a wrapper for readnone function to block the optimizations. The main difference is that we missed optimization chances for readnone function if we enabled coroutine. And this would be addressed in following patch.
Diff Detail
Event Timeline
clang/lib/CodeGen/CGCall.cpp | ||
---|---|---|
2132 | This is a global setting and will affect every function in every file that has coroutines enabled, which is presumably every file compiled with -std=c++20 or later. So you're radically changing optimization for a ton of C++ code that doesn't use coroutines. It should be fine to just add coro_readnone instead of readnone to call sites from coroutine bodies, right? Oh, but I guess we could have a readnone call in an inlined function body. Can we just suppress this kind of code motion in coroutine bodies, or is that too invasive to the optimizer? Actually, maybe your change to this patch was a misunderstanding. What I was trying to say in Discourse was that I wasn't sure that your early pass approach would work for thread-local variables, because addresses of thread-locals are LLVM constants and can get moved around implicitly; you need the frontend to be involved there so that there's something hooked into the right place in the emitted function. But I think it should work for this readnone annotation. |
clang/lib/CodeGen/CGCall.cpp | ||
---|---|---|
2132 |
Yeah... I misunderstood your words. You were talking about TLS variable only but I thought you mentioned both.
Yeah... it is a little better in current version since it affects codes with coroutines only. So it wouldn't affect codes compiled with -std=c++20 or later if they don't use coroutines. For optimizations, I was imaging we could re-enable the optimization by running EarlyCSE pass after CoroCleanup pass in https://reviews.llvm.org/D124364. But as you said, the optimization would be changed in the proposal.
Yes. According to the previous discussion (https://discourse.llvm.org/t/rfc-coroutine-and-pthread-self/56985), people don't like the direction to insert checks for coroutines around the passes. It is a burden for other developers to understand and remember they may handle a coroutine.
I am trying to do that. I think we could replace coro_readnone with readnone when we are inlining a call into an unlowered coroutine. However, I meet a technical problem that it doesn't work if we don't remove readnone for the function declaration... I think we could go to https://reviews.llvm.org/D124361 first, it doesn't depend on this one and it wouldn't meet the optimization changing problem. |
clang/lib/CodeGen/CGCall.cpp | ||
---|---|---|
2132 |
replace readnone with coro_readnone |
This is a global setting and will affect every function in every file that has coroutines enabled, which is presumably every file compiled with -std=c++20 or later. So you're radically changing optimization for a ton of C++ code that doesn't use coroutines.
It should be fine to just add coro_readnone instead of readnone to call sites from coroutine bodies, right? Oh, but I guess we could have a readnone call in an inlined function body. Can we just suppress this kind of code motion in coroutine bodies, or is that too invasive to the optimizer?
Actually, maybe your change to this patch was a misunderstanding. What I was trying to say in Discourse was that I wasn't sure that your early pass approach would work for thread-local variables, because addresses of thread-locals are LLVM constants and can get moved around implicitly; you need the frontend to be involved there so that there's something hooked into the right place in the emitted function. But I think it should work for this readnone annotation.