- User Since
- May 3 2020, 4:37 PM (151 w, 4 d)
Nov 3 2021
I also agree that we should try to keep the compiler simple and not support the complicated case.
It should be fairly straightforward for a codebase to update fully to use std instead of std::experimental (we have a large coroutine codebase as well).
Given that everyone is mostly supportive, I will accept the change.
Oct 21 2021
Thank you for fixing this!
Sep 2 2021
Aug 26 2021
The warning seems redundant to the one in D108696?
Aug 25 2021
In D108696, the clang would lookup in std and std::experimental namespace. And this patch would only be checked in after D108696 get checked in. So I think it wouldn't affect users much.
Well, but if user code has "#include <experimental/coroutine>", would it still compile?
My guess is you might need to keep coro in both std and std::experimental for a while, to allow smooth transition.
Aug 24 2021
I am not familiar with the process of when to move something out of experimental, but I do wonder how this is normally done so that people who uses coroutines can have a smooth migration?
I assume that this is going to be a breaking change that existing code using coroutine will need to be updated and no longer compatible with old versions.
Aug 23 2021
Thanks for working on this!
A few questions:
- Is this primarily an optimization, or does it also fix a bug?
- Does LLVM mid-end not have a nothrow attribute analysis pass? Doing this manually in the front-end seems a bit fragile (concerned about missing something)
- Do we expect those isXXXNoThrow methods to be called else where? If not, is there a reason we choose to put them in 6 separate functions, instead of one function?
Aug 3 2021
Aug 2 2021
Hi! I have a question regarding the implementation of "VisitMaterializeTemporaryExpr". Specifically, I wonder if we should skip visiting the children? Would't visiting the children of MaterializeTemporaryExpr cause the same expression to be visited twice?
Jul 30 2021
Jul 29 2021
Having both "--show-mmap-events --ignore-mmap-events" seems a bit counter-intuitive.
If we already preprocessed the mmap events, could we also assume that we have the right pids and hence don't need --show-mmap-events?
Jul 26 2021
Jul 25 2021
Jul 19 2021
I am OK with this. You may want to move those part of the checking code to a separate function for readability. (since this check is independent from lifetime sinking, we could also consider doing the check earlier for all corosuspends too.
Jul 14 2021
Jul 13 2021
Jun 30 2021
Jun 29 2021
Put the post-split ramp function back to the CGSCC worklist
If coroutine ramp function couldn't get inlined, it would disable coroutine elide optimization. Could you elaborate more on why do you want to do that?
Cool! fix test failure?
If there is an inlining opportunity, it should have happened pre-split, right? Is there any reason it didn't happen pre-split but only post-split?
It seems like that we don't need the attribute CORO_PRESPLIT_ATTR any more, do we? If yes, I think we should remove them.
It's still needed by the legacy pass manager. I don't want to break that yet.
Jun 28 2021
After removing the legacy test command, I was finally able to update this patch. It's now ready for review. I will update the decription to reflect to the latest changes
Yeah, but it may be inlined after splitting, which could trigger coro elide.
In O0, there is no inliner pass (after CoroSplit), so inlining should never happen.
Jun 26 2021
By the way, while you are at it, I was looking at some of the sink lifetime tests, and I was confused by this test:
It's supposed to test that we can sink the lifetime (and hence will not put the alloca on the frame). But looks like we are not able to sink it and the test is testing that it will be on the frame?
Jun 25 2021
Jun 24 2021
I believe the plan is to have all tests to be in -passes style with new pm (I.e -passes is the preferred style)
Jun 23 2021
I guess I am fine with solution 1 and see how far we can go with it.
In the long run, we should look at cases where this optimization (sinkLifetimeStartMarkers) is effective and figure out why the lifetime start markers are emitted early, so that we could improve the lifetime marker creation in the first place. Overall it still feels a bit strange to move lifetime markers around (a sign that the front-end didn't do the right thing)
Jun 22 2021
To be honest, none of the solutions seems satisfying but I couldn't propose a better one.
Jun 17 2021
Add test case for noalias argument
Jun 16 2021
Abandon this in favor of D104184
Jun 15 2021
Jun 14 2021
I updated the description to reflex the summary of discussions from previous patches. Based on the discussion, we believe that this should be the right approach
only check byval attribute
From clang's perspective, what it needs to ensure is that every argument is stored in an allocation local to the callee (in other words, allocations corosplit can handle). "allocations corosplit can handle" should be either allocas, or arguments marked byval. These are pointers that don't exist in the caller, so corosplit can mess with the address in memory without worrying about code it can't see in the caller.
How do you prevent optimization passes (prior to CoroSplit) from messing up with those pointers, though?
Let's say we have an argument that's not marked with byval (due to target specifics) but is pass-by-value in C++. In Clang we could emit IR to copy the argument value to a local alloca. However latter optimization passes could very well optimize out that copy because the copy would appear to be useless without considering what coro_suspend does, which is why we are changing BasicAA to prevent such elimination.
Or are you suggesting we need to modify Clang such that every pass-by-value argument in C++ must be marked with byval?
Unfortunately what we really care about here is whether an argument is by-value from C++ perspective, not just whether it has a byval attribute. Let me try to explain the problem top-down and see if I got it wrong somewhere:
- In C++, when we call a function with a parameter passed by value, such as foo(std::move(a)), the passed value is typically a temporary value allocated at caller, and only the pointer is passed to the callee. When callee returns, the caller will be able to deallocate that temporary value's memory (in LLVM IR terms, there will be a lifetime.end marking the temporary value after the callee returns).
- If the callee is a normal routine (non-coroutine), the callee don't need to worry about the lifetime of the parameter because it will stay alive through the entire life cycle of the callee.
- However if the callee is a coroutine, the situation is different because the coroutine can return in the middle of the function and latter resume. As soon as the coroutine returns for the first time (i.e. suspended the first time), the temporary value (the pass-by-value parameter) would die in the caller. But if the callee needs to use that argument after coroutine resumption, we need to make sure that the argument value will be put on the coroutine frame (similar to allocas), so that latter access won't be accessing invalid memory.
- However there is one major difference between a parameter and an alloca: for a parameter that is a pointer type, we need to know whether we need to copy the value that the pointer points to to the coroutine frame, or just the pointer it self. For instance, you could call a coroutine by passing a direct pointer, and in that case we should only need to copy the pointer to the frame. But for the C++ pass-by-value arguments, we need to copy the value that the pointer points to to the frame. The question is then, during CoroSplit (the pass where we put things to the frame), are we able to tell, given an argument pointer, whether we need to copy the value or just the pointer to the frame? Knowing that a C++ pass-by-value argument doesn't necessarily end up with an argument with a byval attribute, we know that it is not possible to tell, hence we cannot deal with this properly only in CoroSplit pass.
That this sound right?
Jun 13 2021
I just realized that this is now getting very similar to D101980, and we are facing the same problem of not always be able to tell whether a param is byval in the midend. Let me continue the discussion on the BasicAA patch.
Jun 12 2021
Jun 11 2021
hmm sounds like hasPassPointeeByValueCopyAttr() is actually what I need. In CoroSplit, when deciding what to put on the coroutine frame, for arguments, I will need to decide whether it's enough to just put the pointer on the frame, or if I need to copy the content of the pointer (much like allocas) to the frame. Sounds like for inalloca and preallocated memory I still want to copy the content to the frame, not just the pointer.
Poking around it looks like this should be handled for inlining, not sure why the previous case we ran into wasn't covered by that code below from CallAnalyzer::analyze()// Disallow inlining a blockaddress with uses other than strictly callbr. // A blockaddress only has defined behavior for an indirect branch in the // same function, and we do not currently support inlining indirect // branches. But, the inliner may not see an indirect branch that ends up // being dead code at a particular call site. If the blockaddress escapes // the function, e.g., via a global variable, inlining may lead to an // invalid cross-function reference. // FIXME: pr/39560: continue relaxing this overt restriction. if (BB->hasAddressTaken()) for (User *U : BlockAddress::get(&*BB)->users()) if (!isa<CallBrInst>(*U)) return InlineResult::failure("blockaddress used outside of callbr");
LGTM, thanks for the fix.
Jun 10 2021
OK, so the way coroutine codegen works is to first copy all parameters to local alloca, and then if any of those allocas need to live through coroutine suspension, the alloca will be put on the coroutine frame during the CoroSplit pass.
Jun 9 2021
I found a more proper fix in D104007.
Jun 8 2021
Only skip regions that contain BBs with address taken
Add test case
I think so, as long as the cloned BB is in the same module as the original one, we need to worry about this (there are legit cloning in some tools such as llvm-extract, llvm-reduce..)