This is an archive of the discontinued LLVM Phabricator instance.

Add `LambdaCapture`-related matchers.
ClosedPublic

Authored by jcking1034 on Oct 25 2021, 2:12 PM.

Details

Summary

This provides better support for LambdaCaptures by making them first-
class and allowing them to be bindable. In addition, this implements several
LambdaCapture-related matchers. This does not update how lambdas are
traversed. As a result, something like trying to match lambdaCapture() by
itself will not work - it must be used as an inner matcher.

Diff Detail

Event Timeline

jcking1034 requested review of this revision.Oct 25 2021, 2:12 PM
jcking1034 created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptOct 25 2021, 2:12 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I'm a bit confused as to why this is necessary. We already have lambdaExpr(hasAnyCapture(varDecl(hasName("whatever")))) and lambdaExpr(hasAnyCapture(cxxThisExpr())), so I'm not certain why we need this as a parallel construct?

clang/include/clang/ASTMatchers/ASTMatchers.h
4629–4630

The name here is a bit unclear -- whether it is the matcher matching int x; or the x from the capture is not clear from the name. The comment suggests it's matching x from the capture, but I think it's actually matching the int x; variable declaration.

Being clear on what's matched here is important when we think about initializers:

void foo() {
  int x = 12;
  auto f = [x = 100](){};
}

and

lambdaExpr(hasAnyCapture(lambdaCapture(refersToVarDecl(hasName("x"), hasInitializer(integerLiteral(equals(100))))))

Would you expect this to match? (This might be a good test case to add.)

Nice! Per Aaron's comment, it's probably worth expanding the patch description to explain the motivation. I assume that the key is making these first-class and therefore bindable?

clang/include/clang/ASTMatchers/ASTMatchFinder.h
170 ↗(On Diff #382123)

Since top-level metching doesn't work, I think we *don't* want to add this overload.

225 ↗(On Diff #382123)

i guess remove this too, if you remove the overload?

clang/include/clang/ASTMatchers/ASTMatchers.h
4595
4597–4605

Maybe add some more examples? e.g. demonstrating that you can match implicit captures seems valuable. For that same code snippet, maybe something like:

... lambdaCapture(refersToVarDecl(hasName("t")))

4629–4630

In a similar vein, do we want a separate matcher on the name of the capture itself? e.g. an overload of hasName? And what about matchers for the initializers? Those don't have to land in this patch, but do you think those would be doable?

jcking1034 edited the summary of this revision. (Show Details)Oct 26 2021, 10:18 AM
jcking1034 marked 3 inline comments as done.

Revert changes to ASTMatchFinder

jcking1034 marked an inline comment as done.

Update comment with example

Update comments and tests

@aaron.ballman for the purpose of these matchers, what @ymandel said is correct - the goal is to allow LambdaCaptures themselves to be bindable.

clang/include/clang/ASTMatchers/ASTMatchers.h
4629–4630

I would expect @aaron.ballman's initializer example to match, and I added a similar test case to the one described. I think that if a capture does not have an initializer, then refersToVarDecl will match on the variable declaration before the lambda. However, if a capture does have an initializer, that initializer itself seems to be represented as a VarDecl in the AST, which is the VarDecl that gets matched.

For that reason, I think we may not need to have a separate matcher on the name of the capture itself. Additionally, since captures with/without initializers are both represented the same way, there may not be a good way to distinguish between them, so matchers for initializers may not be possible.

@aaron.ballman for the purpose of these matchers, what @ymandel said is correct - the goal is to allow LambdaCaptures themselves to be bindable.

Should we be discussing deprecating the non-bindable matchers for lambda captures? I guess the part that worries me is there are now two ways to match the same sorts of constructs, and I'm a bit worried it won't be clear which one to reach for and when. Do you think this will be a usability concern for users?

clang/include/clang/ASTMatchers/ASTMatchers.h
4629–4630

I think that if a capture does not have an initializer, then refersToVarDecl will match on the variable declaration before the lambda. However, if a capture does have an initializer, that initializer itself seems to be represented as a VarDecl in the AST, which is the VarDecl that gets matched.

Oof, that'd be confusing! :-(

For that reason, I think we may not need to have a separate matcher on the name of the capture itself.

Er, but there are init captures where you can introduce a whole new declaration. I think we do want to be able to match on that, right? e.g.,

[x = 12](){ return x; }();

Additionally, since captures with/without initializers are both represented the same way, there may not be a good way to distinguish between them, so matchers for initializers may not be possible.

That's a bummer! :-( If this turns out to be a limitation, we should probably document it as such.

Update comment with additional example.

jcking1034 marked an inline comment as not done.EditedOct 28 2021, 12:15 PM

I agree that having two ways to match the same thing is a usability concern and could definitely be confusing. Deprecating non-bindable matchers could be a possibility and is probably the right way to go if we choose to include these matchers, but I'm not sure of if eventually removing these other matchers will have any side effects.

clang/include/clang/ASTMatchers/ASTMatchers.h
4629–4630

For the example you've provided, these can be matched with the refersToVarDecl matcher, as seen in the test LambdaCaptureTest_BindsToCaptureWithInitializer. I've gone ahead and updated the documentation to include an example with an initializer.

Having that limitation with initializer representation is definitely a concern, though. Looking through the source for the LambdaCapture class, the documentation for the DeclAndBits (line 42-48) suggests that there isn't a distinguishment between the two cases. However, do you think it's feasible to update the classes related to LambdaCapture obtain and store this information (possibly through updating the LambdaCaptureKind enum, updating the constructor/fields of the class, etc)?

Deprecate original overloads of hasAnyCapture.

Regenerate documentation.

aaron.ballman added a subscriber: sammccall.

In general, I'm happy with this. Adding @sammccall in case I've missed anything regarding the AST matcher internals or design concerns.

clang/include/clang/ASTMatchers/ASTMatchers.h
4629–4630

However, do you think it's feasible to update the classes related to LambdaCapture obtain and store this information (possibly through updating the LambdaCaptureKind enum, updating the constructor/fields of the class, etc)?

I think that would make sense (thought perhaps as an orthogonal patch). That we don't track init captures seems like a deficiency of the AST to me.

This looks really sensible to me too. Some naming bikeshedding, but my main question is migration.

Supporting two things is indeed annoying and confusing. I agree it's not worth keeping the old way forever just to avoid writing refersToVarDecl.
The question is: how soon can we get rid of it? We should consider whether we can do it in this patch: just replace the old matcher with this one.

AFAICT the old matcher has no uses in-tree beyond tests/registration. FWIW it has no uses in our out-of-tree repo either (which generally is a heavy matchers user).
I'm sure it has users somewhere, but probably few, and the mechanical difficulty of updating them is low. Given that I think we should just break them, rather than deal with overloading and deprecation warnings and future cleanup just to put off breaking them for one more release cycle.

This is a tradeoff, to me it seems but reasonable people can disagree on the importance of stability. Aaron, happy to go with whatever you decide.

clang/include/clang/ASTMatchers/ASTMatchers.h
4632

I think capturesVar may be a clearer/more direct name.

For example given int y; [x(y)] { return x; } I would naively expect refersToVarDecl(hasName("y")) to match the capture.

4652

Again, why refersToThis rather than capturesThis, which is more specific and matches the AST?

jcking1034 marked 2 inline comments as done.

Update matcher names for clarity.

jcking1034 added inline comments.Oct 29 2021, 12:16 PM
clang/include/clang/ASTMatchers/ASTMatchers.h
4652

Initially, I think I saw that there were a few TemplateArgument matchers of the form refersToX, so I wanted to maintain some consistency with that. But I think I your suggestion makes things clearer, so will opt for that.

This looks really sensible to me too. Some naming bikeshedding, but my main question is migration.

Supporting two things is indeed annoying and confusing. I agree it's not worth keeping the old way forever just to avoid writing refersToVarDecl.
The question is: how soon can we get rid of it? We should consider whether we can do it in this patch: just replace the old matcher with this one.

AFAICT the old matcher has no uses in-tree beyond tests/registration. FWIW it has no uses in our out-of-tree repo either (which generally is a heavy matchers user).
I'm sure it has users somewhere, but probably few, and the mechanical difficulty of updating them is low. Given that I think we should just break them, rather than deal with overloading and deprecation warnings and future cleanup just to put off breaking them for one more release cycle.

This is a tradeoff, to me it seems but reasonable people can disagree on the importance of stability. Aaron, happy to go with whatever you decide.

I did some looking around to see if I could find any uses of the matcher in the wild, and I can't spot any that aren't in a Clang fork. I think there's plenty of time for us to get feedback from early adopters if removing the old interface causes severe pain for anyone. So I'm on board with replacing the old APIs, but we should definitely have a release note explaining what's happening.

+1 to removing the old versions of these matchers.

ymandel added inline comments.Nov 1 2021, 6:28 AM
clang/include/clang/ASTMatchers/ASTMatchers.h
4629–4630

Yeah, this seems orthogonal, if quite desirable. I imagine this will be a frustration in the future. :( But, fixing the AST itself seems like quite a lot of work.

4630–4632

update to new matcher name.

4650
clang/unittests/ASTMatchers/ASTMatchersNarrowingTest.cpp
4449

rename?

clang/unittests/ASTMatchers/ASTMatchersNodeTest.cpp
2248

and below as well.

jcking1034 updated this revision to Diff 383823.Nov 1 2021, 9:56 AM
jcking1034 marked 4 inline comments as done.

Update missed names; remove original implementations of hasAnyCapture.

This is great!

clang/docs/LibASTMatchersReference.html
8370

I think this should be "matches x and x = 1

sammccall accepted this revision.Nov 2 2021, 6:25 AM

Thanks for the changes, this looks great to me now.

clang/docs/LibASTMatchersReference.html
8370

either that or ... hasName("x") matches ...

Maybe it's worth explicitly saying in the doc text something like "this can be a separate variable captured by value or reference, or a synthesized variable if the capture has an initializer".

(There may end up being some confusion about the fact that these variables don't appear in AST dumps, but I don't think that's your problem to deal with here)

This revision is now accepted and ready to land.Nov 2 2021, 6:25 AM
jcking1034 updated this revision to Diff 384114.Nov 2 2021, 8:15 AM

Update documentation for capturesVar matcher.

jcking1034 marked 2 inline comments as done.Nov 2 2021, 8:17 AM
jcking1034 updated this revision to Diff 384578.Nov 3 2021, 1:57 PM

Rebase onto main.

fowles added a comment.Nov 3 2021, 2:07 PM

Do we also want a forEachCapture matcher?

Do we also want a forEachCapture matcher?

That might be good follow-on work (I wouldn't insist on it for this patch though).

clang/docs/ReleaseNotes.rst
223–229

We should have an additional note about the removal of the old matchers.

fowles added a comment.Nov 4 2021, 6:23 AM

That might be good follow-on work (I wouldn't insist on it for this patch though).

Completely agreed, just something that occurred to me as the next thing I will need when building on this.

Update release notes.

jcking1034 marked an inline comment as done.Nov 4 2021, 11:46 AM
jcking1034 added inline comments.
clang/docs/ReleaseNotes.rst
223–229

I believe that the only change made to the old matchers is that hasAnyCapture now accepts an inner matcher of type Matcher<LambdaCapture>, and no longer accepts inner matchers of type Matcher<CXXThisExpr> or Matcher<VarDecl>. I've revised this note and broken it up into two points for clarity.

jcking1034 marked an inline comment as done.Nov 4 2021, 4:42 PM

@fowles @aaron.ballman I'll take a look at forEachCapture in the next patch. Also, I've discovered that the VarDecl node has a member function isInitCapture that seems like it could allow us to distinguish between captures with and without initializers (link) - I want to play around with that, as well.

@aaron.ballman Just wanted to confirm with you that the work here and release notes look good and can be wrapped up so that I can have @ymandel submit!

aaron.ballman accepted this revision.Nov 8 2021, 10:30 AM

This continues to LGTM, thank you for it!

This revision was automatically updated to reflect the committed changes.