This patch addresses https://bugs.llvm.org/show_bug.cgi?id=48857.
Previous attempts can be found in D104007 and D101980.
Much of the relevant discussion is in those two patches.
To summarize the bug:
When Clang emits IR for coroutines, the first thing it does is to make a copy of every argument to the local stack, so that uses of the arguments in the function will all refer to the local copies instead of the arguments directly.
However, in some cases we find that arguments are still directly used:
When Clang emits IR for a function that has pass-by-value arguments, it sometimes emits an argument with the byval attribute. A byval argument is considered to be local to the function (just like an alloca), so it can easily be determined that it does not alias other values. If the IR contains a memcpy from a byval argument to a local alloca, and then a memcpy from that local alloca to another alloca, MemCpyOpt will optimize out the first memcpy because the byval argument's content will not change. This causes issues because, after a coroutine suspension, the byval argument may die outside of the function, and later uses will be use-after-free.
This is only a problem for arguments with either the byval attribute or the noalias attribute, because only these two kinds are considered local. Arguments without these two attributes are considered to alias coro_suspend, so they do not have this problem. We therefore need to handle these two attributes properly in coroutines.
For noalias arguments, since coro_suspend may potentially change the value of any argument outside of the function, we simply shouldn't mark any argument in a coroutine as noalias. This can be taken care of in the CoroEarly pass.
For byval arguments, if such an argument needs to live across suspensions, we have to copy its value to the frame, not just the pointer.
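To make the byval case concrete, here is a minimal sketch in plain C++ that models the coroutine frame by hand. It is not the CoroFrame implementation: `Big`, `PointerFrame`, `ValueFrame`, and the ramp/resume helpers are hypothetical names made up for illustration, and the caller overwriting its slot stands in for the byval storage dying while the coroutine is suspended.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Hypothetical stand-in for an aggregate that Clang would pass byval.
struct Big {
  int data[16];
};

// Hand-written models of a coroutine frame (assumption: real frames are
// laid out by CoroFrame, not like this).
struct PointerFrame {
  const Big *param; // spills only the address of the caller-owned slot
};
struct ValueFrame {
  Big param; // spills the bytes themselves
};

// "Ramp" functions: run up to the first suspension and hand back the frame.
std::unique_ptr<PointerFrame> rampWithPointer(const Big &byvalSlot) {
  return std::make_unique<PointerFrame>(PointerFrame{&byvalSlot});
}
std::unique_ptr<ValueFrame> rampWithValue(const Big &byvalSlot) {
  auto frame = std::make_unique<ValueFrame>();
  std::memcpy(&frame->param, &byvalSlot, sizeof(Big));
  return frame;
}

// "Resume" functions: read the parameter after the suspension.
int resumePointer(const PointerFrame &frame) { return frame.param->data[0]; }
int resumeValue(const ValueFrame &frame) { return frame.param.data[0]; }

// Drives both models through ramp, slot reuse, and resume.
void runModel() {
  Big slot{};
  slot.data[0] = 42;
  auto pointerFrame = rampWithPointer(slot);
  auto valueFrame = rampWithValue(slot);
  slot.data[0] = -1; // caller reuses the slot while the coroutine is suspended
  assert(resumeValue(*valueFrame) == 42);     // the copy in the frame survives
  assert(resumePointer(*pointerFrame) == -1); // the pointer sees clobbered data
}
```

The model overwrites the slot instead of freeing it so that the stale read stays well defined; in the real bug the storage is gone and the read is a use-after-free.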
Repository: rG LLVM Github Monorepo
Event Timeline
I just realized that this is now getting very similar to D101980, and we are facing the same problem of not always being able to tell whether a param is byval in the midend. Let me continue the discussion on the BasicAA patch.
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
1605 | Good to know. We actually use load+store for all the frame rewrites. We can look at changing all of that in a separate patch. |
I updated the description to reflect the summary of discussions from previous patches. Based on the discussion, we believe that this should be the right approach.
The overall patch looks good to me.
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
1141–1142 | 'instead of the pointer itself' since we wouldn't store the pointer anymore, right? |
1605 | Would the successive pass convert load+store to memcpy? |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
1605 | I think it would |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
1605 | At -O0, we probably end up with a mess. At higher optimization levels, I think we have some code in instcombine to transform weird load/store operations; not sure if we reliably end up with a memcpy. In any case, no reason to rely on that transform here. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
1605 | Thanks, I got it. |
llvm/test/Transforms/Coroutines/coro-byval-param.ll | ||
---|---|---|
7 | Since this patch deals with 'byval' and 'noalias' arguments, it'd be better to add a test case for 'noalias' arguments. |
llvm/test/Transforms/Coroutines/coro-byval-param.ll | ||
---|---|---|
7 | I am actually not sure what would be a reasonable case that involves noalias (I could just duplicate foo with a noalias arg, but I am hoping it can be more realistic) |
llvm/test/Transforms/Coroutines/coro-byval-param.ll | ||
---|---|---|
7 | Yeah, I spent a little time on this and still haven't found a realistic example. If we can't come up with one soon, I think a test that simply covers the code is still worth having. |
Looking at clang a bit more, I think the only way you end up with noalias arguments at the moment is via "__restrict". Which ends up looking basically like your testcase. Probably not too important in practice.
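For reference, a minimal sketch of the kind of source that produces such arguments. `addInto` and `restrictDemo` are hypothetical names, not taken from the patch or its tests; the point is only that pointer parameters qualified with `__restrict` are the ones Clang marks noalias in the IR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: both pointers are __restrict-qualified, so the caller
// promises that dst and src do not overlap.
void addInto(int *__restrict dst, const int *__restrict src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] += src[i];
}

// Small driver that calls the helper with two non-overlapping arrays.
int restrictDemo() {
  int a[3] = {1, 2, 3};
  int b[3] = {10, 20, 30};
  addInto(a, b, 3);
  return a[0] + a[1] + a[2];
}
```

A coroutine taking a parameter like `dst` would be the noalias counterpart of the byval test.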
Patch looks fine from my side, but I'd like to leave final approval to someone more familiar with the coroutine code.
Looks like this breaks tests on windows: http://45.33.8.238/win/40255/step_11.txt
Please take a look and revert for now if it takes a while to fix.
Although I only tested this on Linux, it's hard to believe that this patch would affect that test case. Since it only touches the coroutine passes, it shouldn't affect IR without llvm.coro.* intrinsics.
Hmm...
It might make sense to fix this in clang, rather than here, but I guess this is okay?