This is an archive of the discontinued LLVM Phabricator instance.

[AST] AST structural equivalence to work internally with pairs.
ClosedPublic

Authored by balazske on Aug 21 2019, 7:55 AM.

Details

Summary

The structural equivalence check stores now pairs of nodes in the
'from' and 'to' context instead of only the node in 'from' context
and a corresponding one in 'to' context. This is needed to handle
cases when a Decl in the 'from' context is to be compared with
multiple Decls in the 'to' context.

Event Timeline

balazske created this revision.Aug 21 2019, 7:55 AM
balazske marked an inline comment as done.Aug 22 2019, 1:03 AM
balazske added inline comments.
clang/lib/AST/ASTStructuralEquivalence.cpp
1586

It looks like that the original code is correct in the decision of structural equivalence of the original pair. If we have an (A,B) and (A,C) to compare, B and C are in different decl chain, then (A,B) or (A,C) will be non-equivalent (unless B and C are equivalent, but what to do in this case?). The problem was that the code assumed that in this case always A and B (or A and C) are non-equivalent. If NonEquivalentDecls is not filled in this case (or not used at all) the problem does not exist.

martong added inline comments.Aug 22 2019, 2:37 AM
clang/lib/AST/ASTStructuralEquivalence.cpp
1586

I have a déjá vu feeling. We already had tried to get rid of the redecl chain comparison:
https://github.com/Ericsson/clang/pull/382/files

martong added inline comments.Aug 22 2019, 2:51 AM
clang/unittests/AST/StructuralEquivalenceTest.cpp
1323

This should be EXPECT_NE ! Because void y(A*) is not eq to void y(B*)

balazske marked an inline comment as done.Aug 22 2019, 3:30 AM
balazske added inline comments.
clang/unittests/AST/StructuralEquivalenceTest.cpp
1323

We have no guarantee that every non-equivalent Decl will be collected in NonEquivalentDecls, in this concrete case only the (A1,B2) pair is inserted (this is why the test passes). This line is to be removed, probably the last line too (it is an implementation detail that (A1,B2) is inserted but not even (C1,C2)).

It looks like that the original code is correct in the decision of structural equivalence of the original pair. If we have an (A,B) and (A,C) to compare, B and C are in different decl chain, then (A,B) or (A,C) will be non-equivalent (unless B and C are equivalent, but what to do in this case?). The problem was that the code assumed that in this case always A and B (or A and C) are non-equivalent. If NonEquivalentDecls is not filled in this case (or not used at all) the problem does not exist.

I think we must not update the NonEquivalentDecls cache at that level, because as we've seen (during the discussion of https://reviews.llvm.org/D62131) it may store the wrong pairs.
However, it is still okay to cache the inequivalent decls in one level upper, right after the Finish returns.
See this diff https://github.com/martong/clang/compare/NonEqDecls_cache_in_an_upper_level~...martong:NonEqDecls_cache_in_an_upper_level?expand=1
Right now I started a build on our CI to see if it causes any slowdown.

It looks like that the original code is correct in the decision of structural equivalence of the original pair. If we have an (A,B) and (A,C) to compare, B and C are in different decl chain, then (A,B) or (A,C) will be non-equivalent (unless B and C are equivalent, but what to do in this case?). The problem was that the code assumed that in this case always A and B (or A and C) are non-equivalent. If NonEquivalentDecls is not filled in this case (or not used at all) the problem does not exist.

I think we must not update the NonEquivalentDecls cache at that level, because as we've seen (during the discussion of https://reviews.llvm.org/D62131) it may store the wrong pairs.
However, it is still okay to cache the inequivalent decls in one level upper, right after the Finish returns.
See this diff https://github.com/martong/clang/compare/NonEqDecls_cache_in_an_upper_level~...martong:NonEqDecls_cache_in_an_upper_level?expand=1
Right now I started a build on our CI to see if it causes any slowdown.

This is correct, NonEquivalentDecls can be filled after Finish (in IsEquivalent?) (still we can have similar cache for equivalence too, maybe this is not of so much use). With the new code there is the possibility to get more information about non-equivalence, the NonEquivalentDecls can be filled in Finish too like it is done now, and with not much effort (I think) we can add some dependency relation (a tree structure) to decide which (already visited) Decls become non-equivalent if one non-equivalence is found. If there is a void f(A *, B *) to check the f can be a "parent" of A and B, if A or B is found to be non-equivalent we can set this for f too.

The structural equivalency check works (with this patch) the following way: To check a (A1,A2) pair we collect further (X1,X2) pairs that should be checked and put these in DeclsToCheck (if not already there). The check succeeds if every pair is equivalent. The pairs (A1,X2) and (A1,Y2) are checked independently. A true return value from any of the IsStructurallyEquivalent functions means that we did not found non-equivalence in the internal node data but the pairs in DeclsToCheck should be checked too. Return value of false means a non-equivalence in the internal data only is found with the (X1,X2) pair that is currently checked, therefore we found non-equivalence for the original pair and for (X1,X2) too. Again, with the new code the test CheckCommonEquivalence(D1, D2) && CheckKindSpecificEquivalence(D1, D2) fails only if we found difference in the internal node data and we can be sure that D1 and D2 are non-equivalent. The old code does not have this property.

With the old code (that uses TentativeEquivalences) return value false for IsStructurallyEquivalent for (X1,X2) pair to check can mean that we found a (X1,Y2) to check and X2!=Y2 (decl chain different). But we can not say that (X1,X2) are non-equivalent because it may be that (X1,Y2) are non-equivalent and (X1,X2) are equivalent. (But it is sure that the original pair (A1,A2) is non-equivalent.)

Ok.

I like this patch because it eliminates the need for checking the redeclaration chains.

Seems like it handles cycles and the simple f(A,B) vs f(A,A) cases properly too. (Not talking about caching now, probably we must remove the NonEquivalentDecls cache.)
I've added two new test cases, could you please add them to the patch (maybe with modifications if you find something)?

TEST_F(StructuralEquivalenceCacheTest, Cycle) {
  auto Decls =
      makeTuDecls(
          R"(
            class C;
            class A { C *c; };
            class B {
              int i;
            };
            void x(A *);
            void y(A *);
            class C {
              friend void x(A *);
              friend void y(A *);
            };
          )",
          R"(
            class C;
            class A { C *c; };
            class B {
              int i;
            };
            void x(A *);
            void y(A *);
            class C {
              friend void x(A *);
              friend void y(A *);
            };
          )", Lang_CXX);

  TranslationUnitDecl *TU1 = get<0>(Decls);
  TranslationUnitDecl *TU2 = get<1>(Decls);
  auto *C1 = LastDeclMatcher<CXXRecordDecl>().match(
      TU1, cxxRecordDecl(hasName("C"), unless(isImplicit())));
  auto *C2 = LastDeclMatcher<CXXRecordDecl>().match(
      TU2, cxxRecordDecl(hasName("C"), unless(isImplicit())));

  llvm::DenseSet<std::pair<Decl *, Decl *>> NonEquivalentDecls;
  StructuralEquivalenceContext Ctx(
      C1->getASTContext(), C2->getASTContext(), NonEquivalentDecls,
      StructuralEquivalenceKind::Default, false, false);
  bool Eq = Ctx.IsEquivalent(C1, C2);

  EXPECT_TRUE(Eq);
}

TEST_F(StructuralEquivalenceCacheTest, SimpleNonEq) {
  auto Decls =
      makeTuDecls(
          R"(
            class A {};
            class B {};
            void x(A, A);
          )",
          R"(
            class A {};
            class B {};
            void x(A, B);
          )", Lang_CXX);

  TranslationUnitDecl *TU1 = get<0>(Decls);
  TranslationUnitDecl *TU2 = get<1>(Decls);
  auto *x1 =
      FirstDeclMatcher<FunctionDecl>().match(TU1, functionDecl(hasName("x")));
  auto *x2 =
      FirstDeclMatcher<FunctionDecl>().match(TU2, functionDecl(hasName("x")));

  llvm::DenseSet<std::pair<Decl *, Decl *>> NonEquivalentDecls;
  StructuralEquivalenceContext Ctx(
      x1->getASTContext(), x2->getASTContext(), NonEquivalentDecls,
      StructuralEquivalenceKind::Default, false, false);
  bool Eq = Ctx.IsEquivalent(x1, x2);

  EXPECT_FALSE(Eq);
}

There is a third test which could be useful to test whether there is no faulty cache entries there:

+TEST_F(StructuralEquivalenceCacheTest, DISABLED_NonEq) {
+  auto Decls =
+      makeTuDecls(
+          R"(
+            class A {};
+            class B {
+              int i;
+            };
+            void x(A *);
+            void y(A *);
+            class C {
+              friend void x(A *);
+              friend void y(A *);
+            };
+          )",
+          R"(
+            class A {};
+            class B {
+              int i;
+            };
+            void x(A *);
+            void y(B *);
+            class C {
+              friend void x(A *);
+              friend void y(B *);
+            };
+          )", Lang_CXX);
+
+  TranslationUnitDecl *TU1 = get<0>(Decls);
+  TranslationUnitDecl *TU2 = get<1>(Decls);
+  auto *C1 = LastDeclMatcher<CXXRecordDecl>().match(
+      TU1, cxxRecordDecl(hasName("C"), unless(isImplicit())));
+  auto *C2 = LastDeclMatcher<CXXRecordDecl>().match(
+      TU2, cxxRecordDecl(hasName("C"), unless(isImplicit())));
+  auto *x1 =
+      FirstDeclMatcher<FunctionDecl>().match(TU1, functionDecl(hasName("x")));
+  auto *x2 =
+      FirstDeclMatcher<FunctionDecl>().match(TU2, functionDecl(hasName("x")));
+
+  llvm::DenseSet<std::pair<Decl *, Decl *>> NonEquivalentDecls;
+  {
+    StructuralEquivalenceContext Ctx(
+        C1->getASTContext(), C2->getASTContext(), NonEquivalentDecls,
+        StructuralEquivalenceKind::Default, false, false);
+    EXPECT_FALSE(Ctx.IsEquivalent(C1, C2));
+  }
+
+  // Reuse the cache.
+  {
+    StructuralEquivalenceContext Ctx(
+        C1->getASTContext(), C2->getASTContext(), NonEquivalentDecls,
+        StructuralEquivalenceKind::Default, false, false);
+    EXPECT_TRUE(Ctx.IsEquivalent(x1, x2));
+  }
+}
balazske updated this revision to Diff 216851.Aug 23 2019, 8:04 AM
  • Updated tests.

There is a third test which could be useful to test whether there is no faulty cache entries there:

+TEST_F(StructuralEquivalenceCacheTest, DISABLED_NonEq) {
...
+}

This is somewhat the same check that is done in the current tests when the NonEquivalentDecls is checked to not contain any pairs of equivalent Decls. (Specially this line:

EXPECT_FALSE(isInNonEqCache(
      findDeclPair<FunctionDecl>(TU, functionDecl(hasName("x")))));

)

Hello Balasz,
I have some comments inline.

clang/lib/AST/ASTStructuralEquivalence.cpp
1579

std::pair<Decl *, Decl *> P{D1, D2}?

1888

I would prefer std::tie instead: std::tie(D1, D2) = P;. But it is a matter of taste.

clang/unittests/AST/StructuralEquivalenceTest.cpp
1290

Can we use std::pair instead of std::tuple in this class? The size of tuple is known to be 2 and it will help to avoid such conversions.

1291

return NonEquivalentDecls.count({get<0>(D), get<1>(D)});?

1314

It seems to me that the assertion will become a bit easier to read without an intermediate variable. What do you think?

1357

The declarations of C are not equivalent, but they are not placed into the non-equivalence cache. This looks strange to me, could you please explain this behavior?

1399

We don't have any tests where isInNonEqCache() can return true. Is it expected?

balazske marked 2 inline comments as done.Aug 26 2019, 12:40 AM

I remember that we had "problems" with nonsense ODR warnings where both declarations are the same. Probably the reason for it was this cache problem. Still the NonEquivalentDecls as cache is useful to filter out the non-equivalences that were already encountered to not produce repeated warnings for them.

clang/unittests/AST/StructuralEquivalenceTest.cpp
1357

We can not know which declarations go into the NonEquivalentDecls (at least I do not expect it from this "interface"). It is only meaningful to test that there are no wrong (equivalent) values in it. I think users of this function should not rely on what exactly is put into the NonEquivalentDecls because it is "implementation dependent". Currently the first encountered non-equivalence (during the internal checks) is put into it only, that is here the A and B. It can be a future work to put every encountered non-equivalent declaration into the cache but even that code will find these only until the first non-equivalence is encountered.

1399

Yes, the specification for IsEquivalent should say that it returns if the declaratiopns are equivalent and puts something into the NonEquivalentDecls that is not equivalent. (This "something" is the first encountered non-equivalence and for this the warning message may be produced.)

balazske updated this revision to Diff 217105.Aug 26 2019, 3:24 AM
  • Small update to newer C++ syntax, use std::pair in test.

There is a third test which could be useful to test whether there is no faulty cache entries there:

+TEST_F(StructuralEquivalenceCacheTest, DISABLED_NonEq) {
...
+}

This is somewhat the same check that is done in the current tests when the NonEquivalentDecls is checked to not contain any pairs of equivalent Decls. (Specially this line:

EXPECT_FALSE(isInNonEqCache(
      findDeclPair<FunctionDecl>(TU, functionDecl(hasName("x")))));

)

Ok, that looks good.

martong accepted this revision.Aug 30 2019, 3:52 AM
This revision is now accepted and ready to land.Aug 30 2019, 3:52 AM
a_sidorin accepted this revision.Sep 2 2019, 12:01 AM

Looks good, thank you for addressing the comments!

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 2 2019, 4:03 AM