This is an archive of the discontinued LLVM Phabricator instance.

[ASTImporter] Add basic support for comparing Stmts and compare function bodies
ClosedPublic

Authored by teemperor on Sep 10 2020, 2:52 AM.

Details

Summary

Right now the ASTImporter assumes for most Expr nodes that they are always equal
which leads to non-compatible declarations ending up being merged. This patch
adds the basic framework for comparing Stmts (and with that also Exprs) and implements
the custom checks for a few Stmt subclasses. I'll implement the remaining subclasses
in follow up patches (mostly because there are a lot of subclasses and some of them
require further changes like having GNU language in the testing framework)

The motivation for this is that in LLDB we try to import libc++ source code and some
of the types we are importing there contain expressions (e.g. because they use enable_if<expr>),
so those declarations are currently merged even if they are completely different (e.g.
enable_if<value> ... and enable_if<!value> ... are currently considered equal which is
clearly not true).

Diff Detail

Event Timeline

teemperor created this revision.Sep 10 2020, 2:52 AM
teemperor requested review of this revision.Sep 10 2020, 2:52 AM

Thanks! This patch is really cool, I like it overall.

To make the tests possible I added that doing a structural comparison of FunctionDecls
is now also comparing the function bodies (assuming both FunctionDecls have a body).
This way the tests can just wrap any expression in a function and then we can compare
those. The alternative would be to add a separate API call to compare Stmts or to otherwise
squeeze the expressions into a type that is getting compared which seems rather hacky.

About the tests I think it would be more consistent if makeStmts would return with a pair of statements. Plus, it would be really useful if we could pass the ASTmatchers as parameters as well. This could provide the same flexibility that we already have in case of the tests with Decls, checkout for example LambdaClassesWithDifferentMethods. To achieve this, you have to use the ASTMatcher API directly in makeStmts and you cannot use the FirstDeclMatcher. This is how I imagine a Stmt test:

TEST_F(StructuralEquivalenceStmtTest, BinaryOperator) {
  auto t = makeStmts("void f() { 1 + 1; }", "void f() { 1 + 1; }", binaryOperator(), binaryOperator(), Lang_CXX03);
  EXPECT_TRUE(testStructuralMatch(t));
}

Of course, we need another overload for testStructuralMatch that receives statements.

As a convenience, we could have another makeWrappedStmts function that uses a default matcher (compuntStmt()) and wraps the given source code into a wrapper function as you do now. But this should call into the previously mentioned version of the "pure" makeStmts.
And now the test would have this form:

TEST_F(StructuralEquivalenceStmtTest, BinaryOperator) {
  auto t = makeWrappedStmts("1 + 1", "1 + 1", Lang_CXX03);
  EXPECT_TRUE(testStructuralMatch(t));
}
clang/lib/AST/ASTStructuralEquivalence.cpp
159

I like the idea to encapsulate all this into StmtComparer, nice!

186

Should we check the operands here? Or are they the children()?

209

Good idea!

440

Maybe I am missing something, but why do we need the extra layer of TraverseStmt?
Why can't we call directly the IsStmtEquivalent functions? First on PARENT and then on CLASS?

clang/unittests/AST/StructuralEquivalenceTest.cpp
1457

Probably, we should compare the specific Stmts only, not the FunctionDecls in this case. Refer to my other comment about how we could achieve this.

martong edited reviewers, added: balazske, a_sidorin; removed: a.sidorin, jdoerfert.Sep 11 2020, 12:21 AM
martong added a subscriber: balazske.

Changing reviewers:

  • Removing 'a.sidorin' with dot. Afaik, Alexei's live user is 'a_sidorin' with an underscore, so I added it.
  • Removing 'jdoerfert' because his herald script is ought to match OpenMP related changes, but it matched 'omp' in the word 'compare' in the title.
  • Adding 'balazske' as he's been working recently with this class. @balazske, if you have some time, please take a look, your insight could be useful.
teemperor updated this revision to Diff 291170.Sep 11 2020, 3:10 AM
teemperor edited the summary of this revision. (Show Details)
  • Remove FunctionDecl body comparison workaround.
  • Add makeStmt and makeWrappedStmt with AST matcher parameters.
teemperor marked 5 inline comments as done.Sep 11 2020, 3:20 AM

Thanks for the quick review!

The refactoring to a proper makeStmt really is more consistent than the FunctionDecl workaround. The only small thing that would be different is that Stmt doesn't have any reference to its owning ASTContext unlike a Decl, so I wrapped the returned Stmt in a StmtWithASTContext that takes care of this. StmtWithASTContext isn't a template because we don't seem to need the specific subclass anywhere, but I can also make that a template based on the specific Stmt subclass if that seems more consistent. We could also drop StmtWithASTContext if we're fine with using the member AST0/AST1 members (which gives us the ASTContext we're missing otherwise)

you cannot use the FirstDeclMatcher.

There doesn't seem to be any reference to the Decl class in the FirstDeclMatcher (beside the Decl in the name of course) and the code does in fact work with all node kinds. I can make a follow-up PR to rename give it a 'FirstNodeMatcher' or 'FirstStmtMatcher' alias.

clang/lib/AST/ASTStructuralEquivalence.cpp
186

All Stmts/Exprs in an Expr are usually stored in children, so only Type/Decl references and subclass specific values need to be explicitly checked.

440

I clarified that in the updated documentation, but in essence we first have to get from the generic Stmt type to the specific subclass (e.g. BinaryOperator) then TraverseStmt does all the walking of the class hierarchy from BinaryOperator -> Expr -> Stmt (and calling the respective comparison functions for each class while walking the class hierarchy).

martong accepted this revision.Sep 11 2020, 5:55 AM

Thank you for the changes! This looks good to me now!

The refactoring to a proper makeStmt really is more consistent than the FunctionDecl workaround. The only small thing that would be different is that Stmt doesn't have any reference to its owning ASTContext unlike a Decl, so I wrapped the returned Stmt in a StmtWithASTContext that takes care of this. StmtWithASTContext isn't a template because we don't seem to need the specific subclass anywhere, but I can also make that a template based on the specific Stmt subclass if that seems more consistent. We could also drop StmtWithASTContext if we're fine with using the member AST0/AST1 members (which gives us the ASTContext we're missing otherwise)

Having StmtWithASTContext is a good approach, because in the individual tests we don't really care about the return type of makeStmts, we just pass that to testStructuralMatch.

you cannot use the FirstDeclMatcher.

There doesn't seem to be any reference to the Decl class in the FirstDeclMatcher (beside the Decl in the name of course) and the code does in fact work with all node kinds. I can make a follow-up PR to rename give it a 'FirstNodeMatcher' or 'FirstStmtMatcher' alias.

Yes, a rename would be useful, I vote for 'FirstNodeMatcher'.

clang/lib/AST/ASTStructuralEquivalence.cpp
186

Ok.

440

I see, thanks for the clarification.

clang/unittests/AST/StructuralEquivalenceTest.cpp
84

Good idea!

This revision is now accepted and ready to land.Sep 11 2020, 5:55 AM
balazske accepted this revision.Sep 11 2020, 7:59 AM

Probably it should be tested how this change affects CTU analysis (function body is checked now).

clang/lib/AST/ASTStructuralEquivalence.cpp
187–191

Maybe it looks clearer if a if (!static_cast<bool>(Callee1)) is used first. But is is acceptable in the current form too.

Probably it should be tested how this change affects CTU analysis (function body is checked now).

I queued a build (#1704) in our CI for CTU: http://codechecker-buildbot.eastus.cloudapp.azure.com:8080/job/ctu_pipeline_clang-master-monorepo/
I don't expect much regression because my gut feeling is that it is quite rare to compare two function definitions. But let's see the results. Raphael would you mind if we wait for the results before committing?

clang/lib/AST/ASTStructuralEquivalence.cpp
452

I have a warning here. Some build-bots with -Werror may not like it.

../../git/llvm-project/clang/lib/AST/ASTStructuralEquivalence.cpp:352:3: warning: control reaches end of non-void function [-Wreturn-type]
   }
   ^
teemperor marked 2 inline comments as done.
  • Added early-exit in CallExpr comparison.
  • Added an unreachable to fix a missing return warning.

Probably it should be tested how this change affects CTU analysis (function body is checked now).

I queued a build (#1704) in our CI for CTU: http://codechecker-buildbot.eastus.cloudapp.azure.com:8080/job/ctu_pipeline_clang-master-monorepo/
I don't expect much regression because my gut feeling is that it is quite rare to compare two function definitions. But let's see the results. Raphael would you mind if we wait for the results before committing?

The job finished successfully, let's land this!
Thanks again!

Herald added a project: Restricted Project. · View Herald TranscriptSep 13 2020, 9:25 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript