This is an archive of the discontinued LLVM Phabricator instance.

[ASTMatchers] Add forCallable(), a generalization of forFunction().
ClosedPublic

Authored by NoQ on May 10 2021, 8:23 PM.

Details

Summary

It additionally covers Objective-C methods and blocks that don't inherit from FunctionDecl.

I'm open to suggestions here. For instance, it might make sense to have three different matchers instead (forFunction(), forBlock(), forObjCMethod()) and anyOf() them. I don't think it's practical though; most of the time you want any callable and if you don't you can always narrow it down with forCallable(functionDecl()) or something like that. I guess it might make sense to implement both approaches.

I'm also open to suggestions with respect to the very fact that such matchers exist in the first place. From existing use cases it looks to me that most of the time (including my use case) they're used to avoid the problem with hasDescendant() (and similar matchers) digging into nested declarations (eg., inspecting the body of a lambda within a function when you only want it to inspect the function itself). I would be totally satisfied with a better alternative for hasDescendant() instead - that only traverses statements; that'd probably be faster as well as more precise and concise. @stephenkelly, IIRC you've voiced some strong opinions on this subject on the mailing list.

Diff Detail

Event Timeline

NoQ created this revision.May 10 2021, 8:23 PM
NoQ requested review of this revision.May 10 2021, 8:23 PM
steveire edited reviewers, added: steveire; removed: stephenkelly.May 11 2021, 2:02 AM

I'm not certain what strong opinions I've voiced on the mailing list about that. Was it just that I think hasDescendant can lead to unexpected matches and a more-specific matcher should always be used when available (such as hasArgument)? I do think has, hasDescendant, hasAncestor etc have their uses and they get more useful with utilities like forFunction.

I don't know anything about objc. Is it possible to have callable decls nested within each other, like in c++? Is it common? forFunction is mostly useful for the very common case of lambdas within functions.

Does it make sense to add an overload for LambdaExpr so that you can write forCallable(lambdaExpr())?

Is there a missing test for forFunction(functionDecl())?

aaron.ballman added inline comments.May 11 2021, 5:34 AM
clang/include/clang/ASTMatchers/ASTMatchers.h
7563

The formatting changes are good, but should be done in a separate NFC commit as they're unrelated.

7583–7585

It looks like you can re-flow the comments a bit, and add lambdas to the description.

7597

You should also update Registry.cpp to expose this via the dynamic matchers.

NoQ added a comment.EditedMay 11 2021, 8:18 PM

I'm not certain what strong opinions I've voiced on the mailing list about that. Was it just that I think hasDescendant can lead to unexpected matches and a more-specific matcher should always be used when available (such as hasArgument)?

Yeah probably :D So, anyway, you sound like the right person to ask: do you have any immediate thoughts on a variant of hasDescendant that doesn't lead to unexpected matches (or leads to less unexpected matches) because it stops exploring insides of nested declarations? Would that be useful in your endeavors?

I do think has, hasDescendant, hasAncestor etc have their uses and they get more useful with utilities like forFunction.

I don't know anything about objc. Is it possible to have callable decls nested within each other, like in c++? Is it common? forFunction is mostly useful for the very common case of lambdas within functions.

ObjC class/interface declarations or method implementations cannot be nested into functions. Blocks, however, are like lambdas, they can appear anywhere (in fact, blocks and lambdas are implicitly convertible into each other). Additionally, in Objective-C++ you can have lambdas nested anywhere including methods or blocks and you can still have nested C++ classes everywhere.

Does it make sense to add an overload for LambdaExpr so that you can write forCallable(lambdaExpr())?

Dunno, maybe but I've never encountered such use-cases. If at all, the same problem applies to forFunction() as well.

Is there a missing test for forFunction(functionDecl())?

Uhm yeah, I should add them. There's some code duplication so it's worth it to at least duplicate the tests.

clang/include/clang/ASTMatchers/ASTMatchers.h
7597

Uh-oh, i knew i was forgetting something!

Do we have a checklist? (Do we want to?)

NoQ updated this revision to Diff 344646.May 11 2021, 8:35 PM

Address comments. Yay, diff is no longer confused!

I honestly don't see a reason why it's not a part of forFunction. forFunction matches C++ methods and lambdas, Obj-C methods and blocks don't seem that much more special to have an extended matcher just for those.
I really don't think that it will break someone's workflow and that someone really relied on the fact that forFunction doesn't match those.
Other than that it's a great patch IMO!

I honestly don't see a reason why it's not a part of forFunction. forFunction matches C++ methods and lambdas, Obj-C methods and blocks don't seem that much more special to have an extended matcher just for those.
I really don't think that it will break someone's workflow and that someone really relied on the fact that forFunction doesn't match those.

I think this is a good observation -- if you modify forFunction(), do you see any test failures that look unexpected when running all the clang-tidy tests?

NoQ added a comment.EditedMay 12 2021, 10:05 AM

I totally agree that changing forFunction() to work correctly is the right solution but backwards compatibility breakage is very real because forFunction() accepts a Matcher<FunctionDecl> whereas forCallable() accepts Matcher<Decl> which is a smaller set of matchers. Eg., forFunction(hasName()) is valid code but forCallable(hasName()) is not because BlockDecl isn't a NamedDecl. So, like, I suspect that not a lot of code expects this to be the function, but a lot of code expects it to be a function. See how much types did I have to change just to get the infinite loop checker to compile in D102214.

If we're ok with such compatibility breakage then i'm all for it.

I totally agree that changing forFunction() to work correctly is the right solution but backwards compatibility breakage is very real because forFunction() accepts a Matcher<FunctionDecl> whereas forCallable() accepts Matcher<Decl> which is a smaller set of matchers. Eg., forFunction(hasName()) is valid code but forCallable(hasName()) is not because BlockDecl isn't a NamedDecl. So, like, I suspect that not a lot of code expects this to be the function, but a lot of code expects it to be a function. See how much types did I have to change just to get the infinite loop checker to compile in D102214.

If we're ok with such compatibility breakage then i'm all for it.

That's a fair point. We try to not let changes break existing code, but we also don't make any stability guarantees that prevent us from doing so when that's the right answer. Would this breakage be loud for users of the C++ interface, or would the behavior silently change so that anyone who fails to add a namedDecl() in the right place stops getting expected results?

NoQ added a comment.May 12 2021, 12:16 PM

The breakage is loud; the code will no longer compile when the intermediate decl() (or namedDecl(), or whatever) is not present. The more annoying part is that when you add namedDecl() back (or if you had it spelled out this way from the start, which doesn't make much sense but is valid and shorter than spelling out functionDecl()), your Node.get<FunctionDecl>() in the match callback will silently start returning null (on anything that isn't a functionDecl()) which may lead to unexpected crashes (previously there was no match at all, now there's a match but the node isn't of the expected type). So a relatively distant piece of code will require manual audit in order to address the potential breakage.

NoQ added a comment.May 12 2021, 12:22 PM

I'm also mildly worried that Function is not the technically correct term. Maybe we should mark the old matcher as deprecated instead?

NoQ added a comment.EditedMay 12 2021, 7:49 PM

Additionally, forCallable(functionDecl()) is not equivalent to forFunction() (the former silently matches less stuff). So anybody who had functionDecl() spelled out explicitly in their code will observe a silent change in behavior. I think that most of the time the old behavior was a footgun (i.e., most of the time when people wrote forFunction() they actually expected it to behave like forCallable(functionDecl())) but I can't be 100% sure.

I totally agree that changing forFunction() to work correctly is the right solution but backwards compatibility breakage is very real because forFunction() accepts a Matcher<FunctionDecl> whereas forCallable() accepts Matcher<Decl>

Yeah, I see now, it's not that straightforward.
Parameter type difference can be combated even keeping backward compatibility (so to speak). You can do AST_MATCHER_P_OVERLOAD (like it's done for hasPrefix, for example) to have two overloads of forFunction: one for Matcher<FunctionDecl> (aka old implementation) and one for Matcher<Decl> for new implementation. I didn't check it, but it probably should work. It is not elegant though and I'm not pushing it.

The breakage is loud; the code will no longer compile when the intermediate decl() (or namedDecl(), or whatever) is not present. The more annoying part is that when you add namedDecl() back (or if you had it spelled out this way from the start, which doesn't make much sense but is valid and shorter than spelling out functionDecl()), your Node.get<FunctionDecl>() in the match callback will silently start returning null (on anything that isn't a functionDecl()) which may lead to unexpected crashes (previously there was no match at all, now there's a match but the node isn't of the expected type). So a relatively distant piece of code will require manual audit in order to address the potential breakage.

The breakage example has a bit weird precondition IMO. In that scenario, the user had to use decl or namedDecl where they actually meant functionDecl AND they should run their matchers on code with blocks. And even if it does happen to someone, I think it's not going to be very painful because a simple dump of the node will show what's going on.

Additionally, forCallable(functionDecl()) is not equivalent to forFunction() (the former silently matches less stuff).

I'm not sure I understand this one. Can you please show an example where one matches something that the other doesn't?

NoQ added a comment.EditedMay 13 2021, 9:44 AM

Additionally, forCallable(functionDecl()) is not equivalent to forFunction() (the former silently matches less stuff).

I'm not sure I understand this one. Can you please show an example where one matches something that the other doesn't?

void foo() {
  ^{
    int x = 1;
  }
}

Here declStmt(forCallable(functionDecl())) doesn't match because the callable the variable belongs to isn't a FunctionDecl but declStmt(forFunction()) matches 'x' for 'foo'.

Additionally, forCallable(functionDecl()) is not equivalent to forFunction() (the former silently matches less stuff).

I'm not sure I understand this one. Can you please show an example where one matches something that the other doesn't?

void foo() {
  ^{
    int x = 1;
  }
}

Here declStmt(forCallable(functionDecl())) doesn't match because the callable the variable belongs to isn't a FunctionDecl but declStmt(forFunction()) matches 'x' for 'foo'.

Ahhh, right! Thanks!

I'm also mildly worried that Function is not the technically correct term. Maybe we should mark the old matcher as deprecated instead?

Thank you for the explanations as to why there's a separate matcher. I think using a separate matcher is the safer approach to introduce the functionality. I'd be fine if we marked the old matcher as deprecated in the documentation for it. (I don't believe that we've added any louder deprecation markings yet aside from comments.)

clang/include/clang/ASTMatchers/ASTMatchers.h
7597

I don't think we have a checklist, but the mental checklist I use is:

  • If there's a change to doc comments in ASTMatchers.h, did the HTML file get regenerated?
  • If there's a new matcher added, did Registry.cpp get updated for it?
  • If there are changes to the list in Registry.cpp, is the list still alphabetical?
  • Testcases?
7647

Might as well fix this clang-format warning.

NoQ updated this revision to Diff 345208.May 13 2021, 10:26 AM
NoQ marked 2 inline comments as done.

Address review comments. Add a deprecation notice to forFunction().

clang/include/clang/ASTMatchers/ASTMatchers.h
7597

Wait, it's autogenerated? I updated it by hand :D
Fxd.

vsavchenko accepted this revision.May 13 2021, 10:30 AM

OK, I agree with Aaron that having a separate matcher is reasonable. Thanks for the patient explanation :)
LGTM from my side then.

This revision is now accepted and ready to land.May 13 2021, 10:30 AM
aaron.ballman accepted this revision.May 13 2021, 10:45 AM

LGTM, thank you!

NoQ updated this revision to Diff 345220.May 13 2021, 10:56 AM

Make examples for forCallable() more relatable given that the original matcher is now deprecated.

NoQ updated this revision to Diff 345221.May 13 2021, 10:58 AM

Fix typo in documentation.

This revision was landed with ongoing or failed builds.May 13 2021, 11:25 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2021, 11:25 AM