This allows the matching of two specific statements in sequence within
a compound statement. For instance if (x) return true; return false;
can be matched as
compoundStmt(hasSubstatementSequence(ifStmt(), returnStmt()))
Differential D116518
[ast-matchers] Add hasSubstatementSequence matcher LegalizeAdulthood on Jan 2 2022, 9:19 PM. Authored by
Details
This allows the matching of two specific statements in sequence within compoundStmt(hasSubstatementSequence(ifStmt(), returnStmt()))
Diff Detail
Unit Tests Event TimelineComment Actions Improve matching when the first matcher matches multiple times,
Comment Actions Thanks for looping me in. I'll try to take a detailed look later today. In the meantime, I'll note that we have something similar internally which I never got around to upstreaming. However, we chose to support arbitrarily many matchers, with this interface: const clang::ast_matchers::internal::VariadicFunction< clang::ast_matchers::internal::Matcher<clang::CompoundStmt>, internal::SequenceMatcher<clang::Stmt>, internal::hasSubstatementSequenceFunc> hasSubstatementSequence = {}; The SequenceMatcher API is: // The following definitions all support the `hasSubstatementSequence` // matcher. This matcher supports describing the series of statements in a // compound statement, in a style inspired by regular expressions. Unlike // regular expressions, however, these operators are deterministic. Choices are // tried in order. For optional-style operators (`maybeOne`, `zeroOrMore` and // `oneOrMore`) the positive choice is considered first. template <typename T> internal::SequenceMatcher<T> exactlyOne( clang::ast_matchers::internal::Matcher<T> Matcher) { return {std::move(Matcher)}; } template <typename T> internal::SequenceMatcher<T> maybeOne( clang::ast_matchers::internal::Matcher<T> Matcher) { return {internal::SequenceElementKind::ZeroOrOne, std::move(Matcher)}; } template <typename T> internal::SequenceMatcher<T> zeroOrMore( clang::ast_matchers::internal::Matcher<T> Matcher) { return {internal::SequenceElementKind::ZeroOrMore, std::move(Matcher)}; } template <typename T> internal::SequenceMatcher<T> oneOrMore( clang::ast_matchers::internal::Matcher<T> Matcher) { return {internal::SequenceElementKind::OneOrMore, std::move(Matcher)}; } I also implemented a non-deterministic version using backtracking. But, that scared me off because of its potential cost (potential, b/c we could use memoization like packrat parsing to avoid the exponential). That said, my experience indicates that once you're thinking in terms of sequences, you're probably going to find that you want to match over the CFG, rather than the AST directly. Comment Actions That's very interesting and is certainly more general than what I have
Is there any tooling for matching the CFG currently? I haven't come across Comment Actions Sorry for the delay. Read through the whole thread today. I agree with Aaron -- we shouldn't add this until
As I mentioned in my previous reply, I have a working general interface that I shared and whose implementation I'd be happy to upstream. But, that brings us to point 2, for which I'm on the fence. I think it serves a general audience but I also suspect its really just a frustrating "not quite good enough" solution. To the point of opening up matchers to a wider audience and the cost of ASTMatchers.h -- this sounds like a broader issue that we should find a way to address, well beyond this patch. Happy to be in on that discussion, but I don't think I have the cycles now to drive it. :) Regarding CFG matchers -- none that I know of. We've had it as project on our "interesting clang projects" list (in our heads) for a while, but have yet to invest in it. What we *have* started is a dataflow analysis framework: clang/Analysis/FlowSensitive (see the unittests for some example uses). My main use of hasSubstatementSequence was to find pointer variables which could be safely converted to unique_ptr. I converted that into a dataflow analysis built on the new framework and it's a lot cleaner. Comment Actions Clarification: I'm totally fine with it existing privately, my concerns were with a public implementation. Comment Actions OK, I'm going to migrate this to a private matcher in the check A related question is if it is possible to register private matchers Comment Actions Based on discussion, I'm going to move this as a private matcher Therefore, I'm abandoning this review. I look forward to seeing Yitzhak's generalized matcher :) |
How do we extend this to support testing arbitrary sequences of statements? (If this supports two statements, someone will find a need for three, etc).