This is an archive of the discontinued LLVM Phabricator instance.

[Coroutine][Sema] Cleanup temporaries as early as possible
ClosedPublic

Authored by lxfind on Nov 6 2020, 4:29 PM.

Details

Summary

Clang front-end emits an AST that looks like this for an co_await expression:

|- ExprWithCleanups

|- -CoawaitExpr

  |- -MaterializeTemporaryExpr ... Awaiter
    ...
  |- -CXXMemberCallExpr ... .await_ready
    ...
  |- -CallExpr ... __builtin_coro_resume
    ...
  |- -CXXMemberCallExpr ... .await_resume
    ...

ExprWithCleanups is responsible for cleaning up (including calling dtors) for the temporaries generated in the wrapping expression).
In the above structure, the __builtin_coro_resume part (which corresponds to the code for the suspend case in the co_await with symmetric transfer), the pseudocode looks like this:

__builtin_coro_resume(
 awaiter.await_suspend(
   from_address(
     __builtin_coro_frame())).address());

One of the temporaries that's generated as part of this code is the coroutine handle returned from awaiter.await_suspend() call. The call returns a handle which is a prvalue (since it's a returned value on the fly). In order to call the address() method on it, it needs to be converted into an xvalue. Hence a materialized temp is created to hold it. This temp will need to be cleaned up eventually. Now, since all cleanups happen at the end of the entire co_await expression, which is after the <coro.suspend> suspension point, the compiler will think that such a temp needs to live across suspensions, and need to be put on the coroutine frame, even though it's only used temporarily just to call address() method.
Such a phenomena not only unnecessarily increases the frame size, but can lead to ASAN failures, if the coroutine was already destroyed as part of the await_suspend() call. This is because if the coroutine was already destroyed, the frame no longer exists, and one can not store anything into it. But if the temporary object is considered to need to live on the frame, it will be stored into the frame after await_suspend() returns.

A fix attempt was done in D87470. Unfortunately it is incorrect. The reason is that cleanups in Clang works more like linearly than nested. There is one current state indicating whether it needs cleanup, and an ExprWithCleanups resets that state. This means that an ExprWithCleanups must be capable of cleaning up all temporaries created in the wrapping expression, otherwise there will be dangling temporaries cleaned up at the wrong place.
I eventually found a walk-around (D89066) that doesn't break any existing tests while fixing the issue. But it targets the final co_await only. If we ever have a co_await that's not on the final awaiter and the frame gets destroyed after suspend, we are in trouble. Hence we need a proper fix.

This patch is the proper fix. It does the folllowing things to fully resolve the issue:

  1. The AST has to be generated in the order according to their nesting relationship. We should not generate AST out of order because then the code generator would incorrectly track the state of temporaries and when a cleanup is needed. So the code in buildCoawaitCalls is reorganized so that we will be generating the AST for each coawait member call in order along with their child AST.
  2. await_ready() call is wrapped with an ExprWithCleanups so that temporaries in it gets cleaned up as early as possible to avoid living across suspension.
  3. await_suspend() call is wrapped with an ExprWithCleanups if it's not a symmetric transfer. In the case of a symmetric transfer, in order to maintain the musttail call contract, the ExprWithCleanups is wraaped before the resume call.
  4. In the end, we mark again that it needs a cleanup, so that the entire CoawaitExpr will be wrapped with a ExprWithCleanups which will clean up the Awaiter object associated with the await expression.

Diff Detail

Event Timeline

lxfind created this revision.Nov 6 2020, 4:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 6 2020, 4:29 PM
lxfind requested review of this revision.Nov 6 2020, 4:29 PM
lxfind edited the summary of this revision. (Show Details)Nov 6 2020, 4:31 PM
lxfind added reviewers: bruno, junparser, ChuanqiXu, wenlei.
lxfind edited the summary of this revision. (Show Details)
lxfind edited the summary of this revision. (Show Details)
lxfind edited the summary of this revision. (Show Details)
lxfind edited the summary of this revision. (Show Details)

Very nice explanation, thanks for improving this!

Can you also add a AST dump test? The idea is to test for the presence of ExprWithCleanupss. Something along the lines of clang/test/AST/coroutine-source-location-crash.cpp ignoring the serialization part should be good.

clang/lib/Sema/SemaCoroutine.cpp
475

In case AwaitSuspend is null, is there any need to set Calls.IsInvalid as well?

490

Is there already a test covering this tailcall case? It'd be nice to have one

lxfind added inline comments.Nov 10 2020, 8:21 AM
clang/lib/Sema/SemaCoroutine.cpp
475

Thanks for the catch.

490

Yes both of the symmetric-transfer tests are covering this case.

lxfind added inline comments.Nov 10 2020, 8:31 AM
clang/lib/Sema/SemaCoroutine.cpp
475

Oh actually this is already set in BuildSubExpr.

lxfind updated this revision to Diff 304233.Nov 10 2020, 9:23 AM

Add AST test

bruno accepted this revision.Nov 10 2020, 10:42 AM

LGTM

This revision is now accepted and ready to land.Nov 10 2020, 10:42 AM
This revision was landed with ongoing or failed builds.Nov 10 2020, 1:27 PM
This revision was automatically updated to reflect the committed changes.