The coroutineBodyStmt matcher matches CoroutineBodyStmt AST nodes.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thank you for this! FWIW, it looks like precommit CI found some issues that need to be resolved.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
5498 | I'm not certain it makes sense to me to add CoroutineBodyStmt to hasBody -- in this case, it doesn't *have* a body, it *is* the body. | |
clang/lib/ASTMatchers/Dynamic/Registry.cpp | ||
171 ↗ | (On Diff #485748) | Please keep this sorted alphabetically. |
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
5498 | With respect to hasBody(), my intent was to treat the CoroutineBodyStmt node as analogous to a FunctionDecl or WhileStmt. WhileStmts have information like the loop condition expression, as CoroutineBodyStmts contain the synthesized expressions such as the initial suspend Stmt. Both WhileStmts and CoroutineBodyStmts then have the getBody() methods, usually a CompoundStmt for WhileStmts and either a CompoundStmt or TryStmt for CoroutineBodyStmts. Ultimately, I wanted to be able to match the CoroutineBodyStmt's function-body (using the grammar) from the standard, e.g., coroutineBodyStmt(hasBody(compoundStmt().bind(...))). If there is a different approach you'd recommend that's in line with the AST matcher design strategy, I can use that instead. |
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
5498 | What concerns me about the approach you have is that it doesn't model the AST. A coroutine body statement is a special kind of function body, so it does not itself *have* a body. So this is a bit like adding CompoundStmt to the hasBody matcher in that regard. To me, the correct way to surface this would be to write the matcher: functionDecl(hasBody(coroutineBodyStmt())) to get to the coroutine body, and if you want to bind to the compound statement, I'd imagine functionDecl(hasBody(coroutineBodyStmt(has(compoundStmt().bind(...))))) would be the correct approach. My thinking there is that we'd use the has() traversal matcher to go from the coroutine body to any of the extra information it tracks (the compound statement, the promise, allocate/deallocate, etc). Pinging @klimek and @sammccall for additional opinions. |
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
5498 | I think I see. With my proposal, the match would be functionDecl(hasBody(coroutineBodyStmt(hasBody(stmt().bind(...))))). For completeness, your suggestion would need functionDecl(hasBody(coroutineBodyStmt(has(stmt(anyOf(cxxTryStmt(), compoundStmt()).bind(...)))))). I am a bit hung up though on two things, and clarifications on both would help solidify my understanding of the design philosophy of the matchers. First, since CoroutineBodyStmt has a getBody() method, I assumed it would make it eligible for the hasBody matcher, although perhaps I am making an incorrect assumption/generalization here? Second, the has() approach seems less accurate since we would be relying on the fact that the other children nodes of CoroutineBodyStmt (initial or final suspend point, etc) would never be a CompoundStmt or CXXTryStmt, else we might get unintentional matches. Or, one might miss the CXXTryStmt corner case. Follow up question - should a need arise (say, authoring many clang-tidy checks that need extensive coroutine matcher support on the initial suspend point etc), would we see the matchers supporting things like coroutineBodyStmt(hasInitialSuspendPoint(...), hasFinalSuspendPoint(..)), or rely on a combination of has approach / non-ASTMatcher logic (or locally defined ASTMatchers within the clang-tidy library)? It looks like this phab can be broken down into two changes - first, the addition of a coroutineBodyStmt matcher, and second, a hasBody traversal matcher for coroutineBodyStmt. Happy to separate these out depending on the direction of the discussion. |
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
5498 |
As I understand the evolution here, we started out with hasBody() to match on function declaration AST nodes with a body. It then evolved to include do/while/for (and eventually range-based for) loops, which is a bit of an odd decision (why not anything with a substatement, like if or switch?). Now we're looking to add support for the coroutine body node, but that has a half dozen+ different things we can traverse to, one of which is the compound statement for the coroutine body. All of these cases are kind of different from one another, so the design is a bit confusing (at least to me).
That's a fantastic point that I hadn't considered. We probably will want named traversal methods to go from the coroutine body statement to the other kinds of AST nodes it supports. And based on that, hasBody() is a quite reasonable name because that's one of the other kinds of AST nodes.
I'm leaning more towards introducing new traversal matchers. In terms of what goes into ASTMatchers.h, we usually only add matchers once there are multiple in-tree needs for the matcher, otherwise we recommend using a locally defined matcher. ASTMatchers.h is quite expensive to compile due to all the template shenanigans, and so we've resisted adding matchers unless they're "needed" to try to keep compile times reasonable. Hmmm, I'm coming around to reusing hasBody() like you've done here. I think it's pretty weird to read: functionDecl(hasBody(coroutineBodyStmt(hasBody(stmt().bind(...))) because of the duplicated calls to hasBody, but I think it's more defensible now than I was originally thinking. I'm still curious if other folks who care about AST matchers have an opinion given that this is a bit trickier than usual. |
Hmmm, I'm coming around to reusing hasBody() like you've done here. I think it's pretty weird to read: functionDecl(hasBody(coroutineBodyStmt(hasBody(stmt().bind(...))) because of the duplicated calls to hasBody, but I think it's more defensible now than I was originally thinking. I'm still curious if other folks who care about AST matchers have an opinion given that this is a bit trickier than usual.
I think I'm okay with this direction, but one last try at pinging @klimek and @sammccall in case they've got an opinion. If we don't hear from them by Thur of this week, I'd say it's fine to move forward with the patch as-is.
clang/docs/ReleaseNotes.rst | ||
---|---|---|
871 ↗ | (On Diff #487789) | You should also mention that the hasBody matcher now works with coroutine bodies. |
It seems pretty weird to me that the two child edges from functiondecl -> coroutinebodystmt -> compoundstmt are both called "body".
However that *is* what they're called in the AST today, and having the matchers correspond to the AST's names seems important.
So this seems OK to me (renaming the inner "body" edge to something else seems like a good idea, but is out of scope here).
Thanks both! Assuming this passes build, would one of you mind committing this for me? Chris Cotter <ccotter14@bloomberg.net>
Happy to do so! I've landed it on your behalf in dcf5ad89dcc5cc69b9df3e5dd9be71c65642f519
clang-format not found in user’s local PATH; not linting file.