This provides better support for LambdaCaptures by making them first-
class and allowing them to be bindable. In addition, this implements several
LambdaCapture-related matchers. This does not update how lambdas are
traversed. As a result, something like trying to match lambdaCapture() by
itself will not work - it must be used as an inner matcher.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I'm a bit confused as to why this is necessary. We already have lambdaExpr(hasAnyCapture(varDecl(hasName("whatever")))) and lambdaExpr(hasAnyCapture(cxxThisExpr())), so I'm not certain why we need this as a parallel construct?
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4629–4630 | The name here is a bit unclear -- whether it is the matcher matching int x; or the x from the capture is not clear from the name. The comment suggests it's matching x from the capture, but I think it's actually matching the int x; variable declaration. Being clear on what's matched here is important when we think about initializers: void foo() { int x = 12; auto f = [x = 100](){}; } and lambdaExpr(hasAnyCapture(lambdaCapture(refersToVarDecl(hasName("x"), hasInitializer(integerLiteral(equals(100)))))) Would you expect this to match? (This might be a good test case to add.) |
Nice! Per Aaron's comment, it's probably worth expanding the patch description to explain the motivation. I assume that the key is making these first-class and therefore bindable?
clang/include/clang/ASTMatchers/ASTMatchFinder.h | ||
---|---|---|
170 ↗ | (On Diff #382123) | Since top-level metching doesn't work, I think we *don't* want to add this overload. |
225 ↗ | (On Diff #382123) | i guess remove this too, if you remove the overload? |
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
4595 | ||
4597–4605 | Maybe add some more examples? e.g. demonstrating that you can match implicit captures seems valuable. For that same code snippet, maybe something like: ... lambdaCapture(refersToVarDecl(hasName("t"))) | |
4629–4630 | In a similar vein, do we want a separate matcher on the name of the capture itself? e.g. an overload of hasName? And what about matchers for the initializers? Those don't have to land in this patch, but do you think those would be doable? |
@aaron.ballman for the purpose of these matchers, what @ymandel said is correct - the goal is to allow LambdaCaptures themselves to be bindable.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4629–4630 | I would expect @aaron.ballman's initializer example to match, and I added a similar test case to the one described. I think that if a capture does not have an initializer, then refersToVarDecl will match on the variable declaration before the lambda. However, if a capture does have an initializer, that initializer itself seems to be represented as a VarDecl in the AST, which is the VarDecl that gets matched. For that reason, I think we may not need to have a separate matcher on the name of the capture itself. Additionally, since captures with/without initializers are both represented the same way, there may not be a good way to distinguish between them, so matchers for initializers may not be possible. |
Should we be discussing deprecating the non-bindable matchers for lambda captures? I guess the part that worries me is there are now two ways to match the same sorts of constructs, and I'm a bit worried it won't be clear which one to reach for and when. Do you think this will be a usability concern for users?
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4629–4630 |
Oof, that'd be confusing! :-(
Er, but there are init captures where you can introduce a whole new declaration. I think we do want to be able to match on that, right? e.g., [x = 12](){ return x; }();
That's a bummer! :-( If this turns out to be a limitation, we should probably document it as such. |
I agree that having two ways to match the same thing is a usability concern and could definitely be confusing. Deprecating non-bindable matchers could be a possibility and is probably the right way to go if we choose to include these matchers, but I'm not sure of if eventually removing these other matchers will have any side effects.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4629–4630 | For the example you've provided, these can be matched with the refersToVarDecl matcher, as seen in the test LambdaCaptureTest_BindsToCaptureWithInitializer. I've gone ahead and updated the documentation to include an example with an initializer. Having that limitation with initializer representation is definitely a concern, though. Looking through the source for the LambdaCapture class, the documentation for the DeclAndBits (line 42-48) suggests that there isn't a distinguishment between the two cases. However, do you think it's feasible to update the classes related to LambdaCapture obtain and store this information (possibly through updating the LambdaCaptureKind enum, updating the constructor/fields of the class, etc)? |
In general, I'm happy with this. Adding @sammccall in case I've missed anything regarding the AST matcher internals or design concerns.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4629–4630 |
I think that would make sense (thought perhaps as an orthogonal patch). That we don't track init captures seems like a deficiency of the AST to me. |
This looks really sensible to me too. Some naming bikeshedding, but my main question is migration.
Supporting two things is indeed annoying and confusing. I agree it's not worth keeping the old way forever just to avoid writing refersToVarDecl.
The question is: how soon can we get rid of it? We should consider whether we can do it in this patch: just replace the old matcher with this one.
AFAICT the old matcher has no uses in-tree beyond tests/registration. FWIW it has no uses in our out-of-tree repo either (which generally is a heavy matchers user).
I'm sure it has users somewhere, but probably few, and the mechanical difficulty of updating them is low. Given that I think we should just break them, rather than deal with overloading and deprecation warnings and future cleanup just to put off breaking them for one more release cycle.
This is a tradeoff, to me it seems but reasonable people can disagree on the importance of stability. Aaron, happy to go with whatever you decide.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4632 | I think capturesVar may be a clearer/more direct name. For example given int y; [x(y)] { return x; } I would naively expect refersToVarDecl(hasName("y")) to match the capture. | |
4652 | Again, why refersToThis rather than capturesThis, which is more specific and matches the AST? |
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4652 | Initially, I think I saw that there were a few TemplateArgument matchers of the form refersToX, so I wanted to maintain some consistency with that. But I think I your suggestion makes things clearer, so will opt for that. |
I did some looking around to see if I could find any uses of the matcher in the wild, and I can't spot any that aren't in a Clang fork. I think there's plenty of time for us to get feedback from early adopters if removing the old interface causes severe pain for anyone. So I'm on board with replacing the old APIs, but we should definitely have a release note explaining what's happening.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4629–4630 | Yeah, this seems orthogonal, if quite desirable. I imagine this will be a frustration in the future. :( But, fixing the AST itself seems like quite a lot of work. | |
4630–4632 | update to new matcher name. | |
4650 | ||
clang/unittests/ASTMatchers/ASTMatchersNarrowingTest.cpp | ||
4449 | rename? | |
clang/unittests/ASTMatchers/ASTMatchersNodeTest.cpp | ||
2248 | and below as well. |
This is great!
clang/docs/LibASTMatchersReference.html | ||
---|---|---|
8394 | I think this should be "matches x and x = 1 |
Thanks for the changes, this looks great to me now.
clang/docs/LibASTMatchersReference.html | ||
---|---|---|
8394 | either that or ... hasName("x") matches ... Maybe it's worth explicitly saying in the doc text something like "this can be a separate variable captured by value or reference, or a synthesized variable if the capture has an initializer". (There may end up being some confusion about the fact that these variables don't appear in AST dumps, but I don't think that's your problem to deal with here) |
That might be good follow-on work (I wouldn't insist on it for this patch though).
clang/docs/ReleaseNotes.rst | ||
---|---|---|
223–229 ↗ | (On Diff #384578) | We should have an additional note about the removal of the old matchers. |
Completely agreed, just something that occurred to me as the next thing I will need when building on this.
clang/docs/ReleaseNotes.rst | ||
---|---|---|
223–229 ↗ | (On Diff #384578) | I believe that the only change made to the old matchers is that hasAnyCapture now accepts an inner matcher of type Matcher<LambdaCapture>, and no longer accepts inner matchers of type Matcher<CXXThisExpr> or Matcher<VarDecl>. I've revised this note and broken it up into two points for clarity. |
@fowles @aaron.ballman I'll take a look at forEachCapture in the next patch. Also, I've discovered that the VarDecl node has a member function isInitCapture that seems like it could allow us to distinguish between captures with and without initializers (link) - I want to play around with that, as well.
@aaron.ballman Just wanted to confirm with you that the work here and release notes look good and can be wrapped up so that I can have @ymandel submit!
I think this should be "matches x and x = 1