This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Expose CoawaitExpr's operand in the AST
ClosedPublic

Authored by nridge on Dec 6 2021, 1:53 PM.

Details

Summary

Previously the Expr returned by getOperand() was actually the
subexpression common to the "ready", "suspend", and "resume"
expressions, which often isn't just the operand but e.g.
await_transform() called on the operand.

It's important for the AST to expose the operand as written
in the source for traversals and tools like clangd to work
correctly.

Fixes https://github.com/clangd/clangd/issues/939

Diff Detail

Event Timeline

nridge created this revision.Dec 6 2021, 1:53 PM

FWIW I agree the "store" option would be more reliable here and also simpler - I think it's just storing one more pointer per co_await, and one we have easy access to.
(I would be worried about what happens to that expression in template instantiation, but dependent CoroutineSuspendExprs only store the written form anyway, so it probably actually becomes cleaner)

A change that probably goes along with this one is having RecursiveASTVisitor traverse the syntactic form instead of the semantic one if traversal of implicit code is off.

nridge updated this revision to Diff 407028.Feb 8 2022, 6:56 PM

Rework to store operand instead of digging it out from the common-expr

(Keeping this in the WIP state for now, there's some crashiness I need to sort out)

nridge updated this revision to Diff 420107.Apr 3 2022, 11:50 PM

Add clang test

Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2022, 11:50 PM
nridge retitled this revision from [WIP] [clangd] Attempted fix for issue 939 to [clangd] Expose CoawaitExpr's operand in the AST.Apr 3 2022, 11:51 PM
nridge edited the summary of this revision. (Show Details)
nridge published this revision for review.Apr 3 2022, 11:54 PM
nridge added reviewers: sammccall, kadircet.

Can't seem to reproduce the crashiness any more, so I wrote a clang test and I'm submitting this for review.

I'm happy to take guidance on what the clang test should look like. What I have currently is a FileCheck test to verify that a CoawaitExpr has the expected subexpressions. I added a new file for this because the existing coroutine-related tests that I could find did not use FileCheck (they were just testing for expected-errors and such).

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 3 2022, 11:54 PM

A change that probably goes along with this one is having RecursiveASTVisitor traverse the syntactic form instead of the semantic one if traversal of implicit code is off.

I believe that's already the behaviour that falls out of the current implementation:

DEF_TRAVERSE_STMT(CoawaitExpr, {
  if (!getDerived().shouldVisitImplicitCode()) {
    TRY_TO_TRAVERSE_OR_ENQUEUE_STMT(S->getOperand());
    ShouldVisitChildren = false;
  }
})

If we are not visiting implicit code, we only visit the operand, otherwise we visit all subexpressions.

This generally looks really good, I'm worried about the separate transformation in the template case though.

clang/include/clang/AST/ExprCXX.h
4728

I think a comment here like // The syntactic operand written in the code or so would help clarify the distinction between this and the common/ready/suspend/resume family.

I'm just thinking of all the times working on tooling when you're trying to understand how to handle every node type without deep-diving into the semantics of each one :-)

Also maybe group getOperand(), getKeywordLoc(), getBeginLoc(), getEndLoc() together?

clang/lib/Sema/SemaCoroutine.cpp
737 ↗(On Diff #420107)

I'm not sure Suspend.get() here is the most appropriate value for Operand.

It's a little abstract, since we're talking about the syntactic operand of an implicit call :-)
But for consistency with the explicit case, I think the operand should be the result of initial_suspend() etc, *before* we call operator co_await.

i.e. the value of Suspend.get() a couple of lines above.

838 ↗(On Diff #420107)

This has become confusing I think. The old code was was changing the meaning of the E, because we didn't need the original anymore. Rather than introduce a new variable for the original meaning, I'd prefer to introduce one for the changed meaning:

// line 848
auto *Transformed = E;
if (lookupMember(...)) {
  ...
  Transformed = R.get();
}
buildOperatorCoAwaitCall(..., Transformed);

(if you also want to rename E -> Operand, that makes sense to me too!)

858 ↗(On Diff #420107)

(aside, this variable name is really unfortunate: I think E here is the awaitable, and Awaitable is the _awaiter_. If this is right, feel free to change it or not...)

865 ↗(On Diff #420107)

Not really your fault, but having two Exprs Operand and E is terribly confusing in isolation. I think E is Awaiter in standardese. Again feel free to change or not.

clang/lib/Sema/TreeTransform.h
7934 ↗(On Diff #420107)

Ah, this doesn't *sound* right - it's going to create duplicate subexprs I think, and we should really try to have the operand still be a subexpr of the commonexpr like in the non-instantiated case.

TransformInitListExpr() is a fairly similar case, and it just discards the semantic form and rebuilds from the syntactic one. TransformPseudoObjectExpr seems to be similar.

I suppose the analogous thing to do here would be to have RebuildCoawaitExpr call BuildUnresolvedCoawaitExpr with the operand only, but this is a large change! (And suggests maybe we should stop building the semantic forms of dependent coawaits entirely, though that probably would regress diagnostics).
Maybe worth giving this a spin anyway?

@rsmith, care to weight in here?

7947 ↗(On Diff #420107)

(FWIW I don't know what "injected into a new context" means, but I don't think the promise type can change since DependentCoawaitExpr was added in 20f25cb6dfb3364847f4c570b1914fe51e585def.)

nridge updated this revision to Diff 423343.Apr 18 2022, 12:48 AM
nridge marked 5 inline comments as done.

Address review comments

nridge added inline comments.Apr 18 2022, 12:50 AM
clang/lib/Sema/SemaCoroutine.cpp
858 ↗(On Diff #420107)

I believe you're right, changed it

865 ↗(On Diff #420107)

Agreed, changed this as well

clang/lib/Sema/TreeTransform.h
7959 ↗(On Diff #423343)

Note, I copied this comment from here.

7934 ↗(On Diff #420107)

I suppose the analogous thing to do here would be to have RebuildCoawaitExpr call BuildUnresolvedCoawaitExpr with the operand only,

I gave this a try.

7947 ↗(On Diff #420107)

Is the implication of this that there are some conditions under which we can reuse the original CoawaitExpr rather than rebuilding it?

sammccall accepted this revision.Apr 20 2022, 2:26 AM

This seems right to me!

clang/lib/Sema/TreeTransform.h
7947 ↗(On Diff #420107)

Yes, I think this is how instantiation of non-dependent code works, e.g. in the following TransformObjCAtTryStmt:

// If nothing changed, just retain this statement.
if (!getDerived().AlwaysRebuild() &&
    TryBody.get() == S->getTryBody() &&
    !AnyCatchChanged &&
    Finally.get() == S->getFinallyStmt())
  return S;

I think missing this opportunity means that a non-dependent co_await will nevertheless be instantiated into a new node, which in turn means that every containing node will *also* be instantiated (because its child nodes changed during instantiation).
So this affects speed and memory usage, but I suppose not correctness.

Anyway I don't know exactly what the conditions are where it's safe to reuse the node, I suppose we leave this as-is.

This revision is now accepted and ready to land.Apr 20 2022, 2:26 AM

Update on this: I'm working on getting the failing tests to pass. Some of them just need expected AST dumps updated to reflect the new node, but some others suggest the patch actually causes semantic regressions.

Here is a reduced testcase for the semantic regression:

namespace std {
template <class PromiseType = void>
struct coroutine_handle {
  static coroutine_handle from_address(void *) noexcept;
};
template <class Ret, class... Args>
struct coroutine_traits;
} // end of namespace std

template <typename Promise> struct coro {};
template <typename Promise, typename... Ps>
struct std::coroutine_traits<coro<Promise>, Ps...> {
  using promise_type = Promise;
};

struct awaitable {
  bool await_ready() noexcept;
  template <typename F>
  void await_suspend(F) noexcept;
  void await_resume() noexcept;
} a;

struct transform_awaitable {};

struct transform_promise {
  coro<transform_promise> get_return_object();
  awaitable initial_suspend();
  awaitable final_suspend() noexcept;
  awaitable await_transform(transform_awaitable);
  void unhandled_exception();
  void return_void();
};

template <typename T, typename U>
coro<T> await_template(U t) {
  co_await t;
}

template coro<transform_promise> await_template<transform_promise>(transform_awaitable);

Without my patch, this compiles fine. With it, I clang gives the following errors:

coroutines.cpp:35:9: error: no viable conversion from 'awaitable' to 'transform_awaitable'
coro<T> await_template(U t) {
        ^~~~~~~~~~~~~~
coroutines.cpp:39:34: note: in instantiation of function template specialization 'await_template<transform_promise, transform_awaitable>' requested here
template coro<transform_promise> await_template<transform_promise>(transform_awaitable);
                                 ^
coroutines.cpp:23:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'awaitable' to 'const transform_awaitable &' for 1st argument
struct transform_awaitable {};
       ^
coroutines.cpp:23:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'awaitable' to 'transform_awaitable &&' for 1st argument
struct transform_awaitable {};
       ^
coroutines.cpp:29:48: note: passing argument to parameter here
  awaitable await_transform(transform_awaitable);
                                               ^
coroutines.cpp:35:9: note: call to 'await_transform' implicitly required by 'co_await' here
coro<T> await_template(U t) {
        ^~~~~~~~~~~~~~
1 error generated.

Without my patch, this compiles fine. With it, I clang gives the following errors:

coroutines.cpp:35:9: error: no viable conversion from 'awaitable' to 'transform_awaitable'
coro<T> await_template(U t) {
        ^~~~~~~~~~~~~~
coroutines.cpp:39:34: note: in instantiation of function template specialization 'await_template<transform_promise, transform_awaitable>' requested here
template coro<transform_promise> await_template<transform_promise>(transform_awaitable);

Poked at this a bit locally.
From the message we see there's a call to BuildUnresolvedCoawaitExpr with an awaitable rather than transform_awaitable, so the await_transform call fails.

It turns out this is the initial_suspend call (CXXMemberCallExpr). We're not supposed to transform initial/final suspends, but the code is now doing so.

This seems to be a consequence of switching from BuildResolvedCoawaitExpr (which does not apply the transform) to BuildUnresolvedCoawaitExpr (which does) during template instantiation.

nridge updated this revision to Diff 426356.May 2 2022, 1:39 AM

Handle implicit coawait-exprs properly in RebuildCoawaitExpr, other test fixes

nridge added a comment.May 2 2022, 1:43 AM

We're not supposed to transform initial/final suspends, but the code is now doing so.

This seems to be a consequence of switching from BuildResolvedCoawaitExpr (which does not apply the transform) to BuildUnresolvedCoawaitExpr (which does) during template instantiation.

Thanks, this helped me get onto the right track for a fix.

Indeed, we have two different codepaths by which CoawaitExprs are built: explicit ones use BuildUnresolvedCoawaitExpr, while the implicit ones for the suspends use BuildResolvedCoawaitExpr with a bit of extra logic before (to apply operator coawait (but not await_transform) if appropriate).

Thankfully, CoawaitExpr remembers which kind it is in isImplicit(), so it's straightforward to get TreeTransform to do the right thing based on this flag.

sammccall accepted this revision.May 2 2022, 2:35 AM

Looks good to me - this still seems hairy and abstract to me so I'd "expect the unexpected" in terms of bot or downstream breakage...

clang/include/clang/Sema/Sema.h
10394 ↗(On Diff #426356)

nit: update param names here

clang/lib/Sema/SemaCoroutine.cpp
824 ↗(On Diff #426356)

this name is a bit confusing - the result isn't (necessarily) unresolved, it's more like the inputs are resolved.

I don't actually think we should rename it, but maybe add a comment like Attempts to resolve and build a CoAwaitExpr from "raw" inputs, bailing out to DependentCoawaitExpr if needed

(Not caused by this patch, but I feel like the only time to add comments to make this less confusing is when we've been digging into it)

clang/lib/Sema/TreeTransform.h
1476 ↗(On Diff #426356)

This is essentially inlining BuildUnresolvedCoawaitExpr, with the await_transform logic dropped, right? I wonder how that compares to adding a bool Transform to that function.

Such a param would make everything less direct, but it also seems possible that the two copies could unexpectedly diverge either now (e.g. could the "placeholder type" logic from BuildUnresolved be relevant here?) or more likely in the future.

Up to you, I don't feel strongly here, it just seems a little surprising.

clang/test/AST/coroutine-locals-cleanup-exp-namespace.cpp
88 ↗(On Diff #426356)

nit: I think it'd be nice to include the type here for consistency

nridge updated this revision to Diff 427998.May 9 2022, 1:14 AM
nridge marked 2 inline comments as done.

Address review comments

clang/lib/Sema/TreeTransform.h
1476 ↗(On Diff #426356)

This is essentially inlining BuildUnresolvedCoawaitExpr, with the await_transform logic dropped, right?

Not exactly. Rather, it's attempting to replicate the logic here, which is where implicit CoawaitExpr's are built in the first place.

(The only reason I didn't factor out those few lines into a function of their own is that here we already have the UnresolvedLookupExpr for the operator coawait, whereas there it's calling a helper that builds the lookup-expr followed by calling Sema::BuildOperatorCoawaitCall().)

So, precisely because BuildUnresolvedCoawaitExpr does a few extra things like the "placeholder type" logic that the code we're trying to replicate doesn't, I think it's safer not to try and reuse that here.

clang/test/AST/coroutine-locals-cleanup-exp-namespace.cpp
88 ↗(On Diff #426356)

A full line of example output here is:

CXXBindTemporaryExpr 0x1898248 <col:14, col:18> 'Task' (CXXTemporary 0x1898248)

not sure what type you have in mind besides 'Task', which is already there.

nridge added a comment.May 9 2022, 1:18 AM

There is one remaining test failure, related to a duplicate diagnostic. I've reduced it to the following:

namespace std {

template <class Ret, class... Args>
struct coroutine_traits {};
} // end of namespace std

struct awaitable {
  bool await_ready() noexcept;
  template <typename F>
  void await_suspend(F) noexcept;
  void await_resume() noexcept;
} a;

struct suspend_always {
  bool await_ready() noexcept { return false; }
  template <typename F>
  void await_suspend(F) noexcept;
  void await_resume() noexcept {}
};

struct yielded_thing { const char *p; short a, b; };

struct promise {
  void get_return_object();
  suspend_always initial_suspend();
  suspend_always final_suspend() noexcept;
  awaitable yield_value(int);
  awaitable yield_value(yielded_thing);
  void return_value(int);
  void unhandled_exception();
};

template <typename... T>
struct std::coroutine_traits<void, T...> { using promise_type = promise; };

namespace std {
template <class PromiseType = void>
struct coroutine_handle {
  static coroutine_handle from_address(void *) noexcept;
};
template <>
struct coroutine_handle<void> {
  template <class PromiseType>
  coroutine_handle(coroutine_handle<PromiseType>) noexcept;
  static coroutine_handle from_address(void *) noexcept;
};
} // namespace std

void yield() {
  co_yield 0;
  co_yield {1e100}; // expected-error {{cannot be narrowed}} expected-note {{explicit cast}} expected-warning {{implicit conversion}} expected-warning {{braces around scalar}}
  co_yield {"foo", __LONG_LONG_MAX__}; // expected-error {{cannot be narrowed}} expected-note {{explicit cast}} expected-warning {{changes value}}
}

The last two lines each trigger an "implicit conversion" warning (among other diagnostics), which is expected, but with my patch the "implicit conversion" warnings appear twice each.

nridge added a comment.May 9 2022, 1:56 AM

I think the issue is related to this loop in AnalyzeImplicitConversions(), which iterates over Expr::children(), and adds each child to a list of expressions to be checked for implicit conversions.

CoroutineSuspendExpr now has the operand as an extra child, and an implicit conversion in the operand gets diagnosed both when processing the operand, and the common-expr.

I'm guessing this will need an explicit carve-out in AnalyzeImplicitConversions. There is already some special handling of other expressions types there, including at least one whose purpose is to avoid duplicate diagnostics.

I think the issue is related to this loop in AnalyzeImplicitConversions(), which iterates over Expr::children(), and adds each child to a list of expressions to be checked for implicit conversions.

CoroutineSuspendExpr now has the operand as an extra child, and an implicit conversion in the operand gets diagnosed both when processing the operand, and the common-expr.

I'm guessing this will need an explicit carve-out in AnalyzeImplicitConversions. There is already some special handling of other expressions types there, including at least one whose purpose is to avoid duplicate diagnostics.

Yeah, I agree.
The general issue here is that storing alternate representations that share subexpressions can cause duplicated traversal. I guess having RecursiveASTVisitor do the right thing will cover most cases.

PseudoObjectExpr has this pattern (for objc property access), InitListExpr kind of does this (syntactic vs semantic forms). For both, I don't fully understand the mechanism, and it seems to require hacks in lots of places :-(
FWIW PseudoObjectExpr avoids duplicated diagnostics here by wrapping all its semantic/derived subexpressions inside OpaqueValueExpr, the special case you found... I don't know what the implications of trying to reuse that mechanism would be. The explicit carve-out seems simpler.

clang/lib/Sema/TreeTransform.h
1476 ↗(On Diff #426356)

Thank you, that makes a lot of sense.
Could you add to the comment something like:
// This mirrors how the implicit CoAwaitExpr is originally created in Sema::ActOnCoroutineBodyStart?

clang/test/AST/coroutine-locals-cleanup-exp-namespace.cpp
88 ↗(On Diff #426356)

I'm not sure what I meant either, sorry! Looks good.

nridge updated this revision to Diff 429611.May 15 2022, 10:52 PM

Fix last test case and review comment

nridge marked an inline comment as done.May 15 2022, 10:52 PM
This revision was landed with ongoing or failed builds.May 17 2022, 5:14 AM
This revision was automatically updated to reflect the committed changes.