Add a new matcher forEveryDescendant that recursively matches descendants of a Stmt but skips nested callable definitions.
This matcher has same effect as using forEachDescendant and skipping forCallable explicitly but does not require the AST construction to be complete.
Details
- Reviewers
NoQ jkorous t-rasmud malavikasamak aaron.ballman xazax.hun gribozavr - Commits
- rG5be422b49288: [Fix][-Wunsafe-buffer-usage] Add a new `forEachDescendant` matcher that skips…
rG6d140b952805: [Fix][-Wunsafe-buffer-usage] Add a new `forEachDescendant` matcher that skips…
rGb2ac5fd724c4: [-Wunsafe-buffer-usage] Add a new `forEachDescendant` matcher that skips…
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Is the problem forEachDescendant matching statements inside blocks and lambdas? I wonder if this behavior would surprise people, so I think it would be better to:
- Potentially add a template bool parameter to forEachDescendant controlling this behavior.
- Review existing uses because I am not entirely sure if the current behavior is the right default.
Let me try to describe the problem in more detail. The goal is to recursively visit every descendant of a node n but skip all nested callable declarations in n. With forEachDescendant, one needs to explicitly test and skip, using forCallable, if a descendant does not belong to the same "callable" as n. forCallable looks for ancestors of a node through the AST so it requires the AST to be complete. But in our case, the AST is under construction. Given a node, we want the visitor only strictly look at descendants of the node.
For the questions, I think blocks and lambdas are the only cases we will have in analyzing "C/C++" + "blocks" programs. Our plan is to generalize this matcher to be an optional (off-by-default) mode of forEachDescendant once it becomes stable. So this implementation also skips other declarations. So far, this matcher is local to unsafe buffer checks.
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
55–56 | Can Node be ever null if the visitor is initiated only be AST_MATCHER_P? |
Yeah, I honestly think that it is the current behavior of forEachDescendant that's counterintuitive and almost never does what the user actually wants. People typically get surprised when it *does* descend into callables, something they didn't even think about.
And the idiom
decl(forEachDescendant(..., forCallable(equalsBoundNode("D"))).bind("D")
is not only clumsy and obscure but also algorithmically slower.
We do need a better name though, the difference between "Each" and "Every" isn't exactly the same as the difference between this matcher's behavior and that matcher's behavior. I was thinking of forEachStmtDescendant, as in it matches "statement-descendants". An even better name could be forEachDescendantInCallable which describes the behavior precisely and also connects to the old idiom for discoverability.
- Review existing uses because I am not entirely sure if the current behavior is the right default.
Yeah that's definitely a good idea.
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
51–52 | ^^ | |
60 | llvm:: is unnecessary, we're under using namespace llvm. |
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
55–56 | Honestly I do not know the exact answer to your question. I was imagining that an AST node could have a null to be one of its children. Our plan later is to make this matcher a general alternative to forEachDescendant, so I think the check for null here is not over-defensive. |
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
55–56 | There can definitely be null children in the AST. Eg. for(;;) {} has null initializer, null condition, null increment, non-null body. I guess this is more about whether RecursiveASTVisitor checks for that automatically before invoking callbacks. |
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
55–56 |
The override-ed TraverseDecl method in the base class RecursiveASTVisitor also has a null-check. So I feel like this null-check is needed here. | |
60 | ||
72 |
The clang binary should not depend on ASTMatchers.
I see this isn't new in this patch, but ASTMatchers should only be used from clang-tidy, not from the compiler itself.
ASTMatchers have been baked into clang binary for at least a few years already, actively used by the static analyzer (some backstory in D25429). This warning is, however, probably the first use in clang proper, so they'll be there even if static analyzer support is turned off through cmake flags.
I'm open to a broader discussion. To me it's natural that there exist "analysis-based warnings" that take advantage of advanced analysis tools, and ASTMatchers is just one such tool, definitely not the most expensive one and not the most hazardous one (CFG and flow-sensitive analysis are arguably much scarier in both regards, and they're actively used in essential warnings such as -Wuninitialized).
With respect to this warning, we're also paying a lot of attention to make sure that performance is acceptable for the compiler. This patch is actually an example of our commitment to performance as it flattens the performance of this analysis down to linear time over AST size, something that wasn't previously possible with ASTMatchers.
For relands, you can just use Differential Revision: https://reviews.llvm.org/D138321 instead of Related differential revision: https://reviews.llvm.org/D138329.
Phabricator doesn't recognize Related differential revision:, so it does not connect ef47a0a711f12add401394f7af07a0b4d1635b56 to this differential.
Revert "Revert can be replaced with Reland to emphasize that it is a reland (it can be omitted as well). A reland should include the original commit message. You may additionally mention what issue it fixes.
Is it possible these errors are due to your change: https://lab.llvm.org/buildbot/#/builders/5/builds/30522/steps/13/logs/stdio? If so please consider reverting/fixing.
ASTMatchers have been baked into clang binary for at least a few years already, actively used by the static analyzer (some backstory in D25429). This warning is, however, probably the first use in clang proper, so they'll be there even if static analyzer support is turned off through cmake flags.
I'm open to a broader discussion. To me it's natural that there exist "analysis-based warnings" that take advantage of advanced analysis tools, and ASTMatchers is just one such tool, definitely not the most expensive one and not the most hazardous one (CFG and flow-sensitive analysis are arguably much scarier in both regards, and they're actively used in essential warnings such as -Wuninitialized).
This has been discussed extensively when ASTMatchers were introduced. Project leadership back then (Doug Gregor and iirc Richard Smith) thought ASTMatches weren't a great approach and agreed to have them merged only if they aren't used in clang itself, but only in clang-tools-extra.
^^