Page MenuHomePhabricator

[Coroutine] Properly deal with byval and noalias parameters
ClosedPublic

Authored by lxfind on Jun 12 2021, 10:12 AM.

Details

Summary

This patch is to address https://bugs.llvm.org/show_bug.cgi?id=48857.
Previous attempts can be found in D104007 and D101980.
A lot of discussions can be found in those two patches.
To summarize the bug:
When Clang emits IR for coroutines, the first thing it does is to make a copy of every argument to the local stack, so that uses of the arguments in the function will all refer to the local copies instead of the arguments directly.
However, in some cases we find that arguments are still directly used:
When Clang emits IR for a function that has pass-by-value arguments, sometimes it emits an argument with byval attribute. A byval attribute is considered to be local to the function (just like alloca) and hence it can be easily determined that it does not alias other values. If in the IR there exists a memcpy from a byval argument to a local alloca, and then from that local alloca to another alloca, MemCpyOpt will optimize out the first memcpy because byval argument's content will not change. This causes issues because after a coroutine suspension, the byval argument may die outside of the function, and latter uses will lead to memory use-after-free.
This is only a problem for arguments with either byval attribute or noalias attribute, because only these two kinds are considered local. Arguments without these two attributes will be considered to alias coro_suspend and hence we won't have this problem. So we need to be able to deal with these two attributes in coroutines properly.
For noalias arguments, since coro_suspend may potentially change the value of any argument outside of the function, we simply shouldn't mark any argument in a coroutiune as noalias. This can be taken care of in CoroEarly pass.
For byval arguments, if such an argument needs to live across suspensions, we will have to copy their value content to the frame, not just the pointer.

Diff Detail

Event Timeline

lxfind created this revision.Jun 12 2021, 10:12 AM
lxfind requested review of this revision.Jun 12 2021, 10:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2021, 10:12 AM

I just realized that this is now getting very similar to D101980, and we are facing the same problem of not always be able to tell whether a param is byval in the midend. Let me continue the discussion on the BasicAA patch.

lxfind updated this revision to Diff 352016.Jun 14 2021, 4:37 PM

only check byval attribute

efriedma added inline comments.Jun 14 2021, 4:49 PM
llvm/lib/Transforms/Coroutines/CoroEarly.cpp
224

Hmm...

It might make sense to fix this in clang, rather than here, but I guess this is okay?

llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1605

Use memcpy to copy large values; load+store results in very inefficient code.

lxfind retitled this revision from [Coroutine] Put byval params' value into frame, instead of just pointer to [Coroutine] Properly deal with byval and noalias parameters.Jun 14 2021, 5:04 PM
lxfind edited the summary of this revision. (Show Details)
lxfind added inline comments.Jun 14 2021, 5:22 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1605

Good to know. We actually use load+store for all the frame rewrites. We can look at changing all of that in a separate patch.

This comment was removed by ChuanqiXu.

I updated the description to reflex the summary of discussions from previous patches. Based on the discussion, we believe that this should be the right approach

The overall patch looks good to me.

llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1141–1142

not just the pointer itself.

'instead of the pointer itself' since we wouldn't store the pointer anymore, right?

1605

Use memcpy to copy large values; load+store results in very inefficient code.

Would the successive pass convert load+store to memcpy?

lxfind updated this revision to Diff 352041.Jun 14 2021, 9:24 PM
lxfind edited the summary of this revision. (Show Details)

update comment

lxfind added inline comments.Jun 15 2021, 8:09 AM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1605

I think it would

efriedma added inline comments.Jun 15 2021, 12:18 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1605

At -O0, we probably end up with a mess. At higher optimization levels, I think we have some code in instcombine to transform weird load/store operations; not sure if we reliably end up with a memcpy. In any case, no reason to rely on that transform here.

ChuanqiXu added inline comments.Jun 15 2021, 6:53 PM
llvm/lib/Transforms/Coroutines/CoroFrame.cpp
1605

Thanks, I got it.

ChuanqiXu added inline comments.Jun 16 2021, 7:47 PM
llvm/test/Transforms/Coroutines/coro-byval-param.ll
7

Since this patch deals with 'byval' and 'noalias' arguments, it'd better to add a test case for 'noalias' arguments.

lxfind added inline comments.Jun 16 2021, 8:36 PM
llvm/test/Transforms/Coroutines/coro-byval-param.ll
7

I am actually not sure what would be a reasonable case that involves noalias (I could just duplicate foo with a noalias arg, but I am hoping it can be more realistic)

ChuanqiXu added inline comments.Jun 16 2021, 10:34 PM
llvm/test/Transforms/Coroutines/coro-byval-param.ll
7

Yeah, I spent a little time and still don't find realistic examples. If we can't get it in short time, I think the covering the codes may be important too.

lxfind updated this revision to Diff 352756.Jun 17 2021, 9:21 AM

Add test case for noalias argument

Looking at clang a bit more, I think the only way you end up with noalias arguments at the moment is via "__restrict". Which ends up looking basically like your testcase. Probably not too important in practice.

Patch looks fine from my side, but I'd like to leave final approval to someone more familiar with the coroutine code.

This revision is now accepted and ready to land.Jun 17 2021, 7:01 PM
This revision was landed with ongoing or failed builds.Jun 17 2021, 7:06 PM
This revision was automatically updated to reflect the committed changes.

Looks like this breaks tests on windows: http://45.33.8.238/win/40255/step_11.txt

Please take a look and revert for now if it takes a while to fix.

Looks like this breaks tests on windows: http://45.33.8.238/win/40255/step_11.txt

Please take a look and revert for now if it takes a while to fix.

Although I only test this on Linux, it's hard to believe that this patch would affect that test case. Since this one only touched coroutine module, it shouldn't affect IR without llvm.coro.* intrinsics.

Looking at clang a bit more, I think the only way you end up with noalias arguments at the moment is via "__restrict". Which ends up looking basically like your testcase. Probably not too important in practice.

Patch looks fine from my side, but I'd like to leave final approval to someone more familiar with the coroutine code.

Clang can also introduce noalias arguments when a function returns a struct.