This is an archive of the discontinued LLVM Phabricator instance.

[ASTMatcher] Correct memoization bug ignoring direction (descendants or ancestors)
ClosedPublic

Authored by loic-joly-sonarsource on May 15 2020, 11:24 AM.

Details

Summary

In ASTMatcher, when we have has(...) and hasParent(...) called with the same internal matcher on the same node, the memoization process will mix-up the two calls because the direction of the traversal is not part of the memoization key.

This patch adds this information.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2020, 11:24 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

This needs rebasing against trunk

klimek added inline comments.May 18 2020, 2:46 AM
clang/lib/ASTMatchers/ASTMatchFinder.cpp
46–49

Nice find! Why don't we need more states though?

  1. wouldn't hasParent() followed by a hasAncestor() also trigger the memoization? (if so, we'd need transitive / non-transitive)
  2. can we trigger a directional match and a non-directional (non-traversal, for example, explicit) match in the same memoization scope?
Godin added a subscriber: Godin.May 18 2020, 9:27 AM

Take into account transitive/non transitive request
Clarify unit tests

This needs rebasing against trunk

I'm doing it

clang/lib/ASTMatchers/ASTMatchFinder.cpp
46–49
  1. Yes, I'm going to add it
  2. Sorry, I did not understand what you mean...
klimek added inline comments.May 19 2020, 12:50 AM
clang/lib/ASTMatchers/ASTMatchFinder.cpp
46–49

Re 2: nevermind, we don't memoize non-transitive matches (other than hasParent / hasChild), right?
In that case, I'm wondering whether the simpler solution is to just not memoize hasParent / hasChild - wouldn't that be more in line with the rest of the memoization?

clang/lib/ASTMatchers/ASTMatchFinder.cpp
46–49

If I correctly understand what you mean, you think that memoizing recursive searches (ancestors/descendants) might not be worth it, and that just memoizing direct searches (parent, child) would be enough.

I really don't know if it's true or not, and I believe that getting this kind of data requires a significant benchmarking effort (which code base, which matchers...). And there might also be other performance-related concerns (for instance, the value of MaxMemoizationEntries, the choice of std::map to store the cache...),

In other words, I think this would go far beyond what this PR is trying to achieve, which is only correcting a bug.

klimek added inline comments.May 20 2020, 7:52 AM
clang/lib/ASTMatchers/ASTMatchFinder.cpp
46–49

What I'm trying to say is:
Currently the only non-transitive matchers we memoize are hasChild and hasParent, which ... seems weird - my operating hypothesis is that back in the day I didn't realize that or I'd have changed it :)

Thus, it seems like a simpler patch is to not memoize if it's not a transitive match.

Sam also had a good idea, which is to change the ASTMatchFinder API to instead of the current:
matchesChildOf
matchesDescendantOf
matchesAncestorOf(..., MatchMode)

create different atoms:
matchesChildOf
matchesDescendantOfOrSelf
matchesParentOf
matchesAncestorOfOrSelf

then hasDescendant(m) would be implemented as hasChild(hasDescendantOrSelf(m)) and the direction would be part of the matcher already.
That as an aside, which I'd propose to not do in the current patch though :)

My proposal for the current patch is to not memoize if we're only matching a single child or parent.

Removing memoization for non recursive matches

loic-joly-sonarsource marked an inline comment as done.Jun 15 2020, 3:45 PM
loic-joly-sonarsource added inline comments.
clang/lib/ASTMatchers/ASTMatchFinder.cpp
46–49

I've applied the change you suggested

klimek accepted this revision.Jun 16 2020, 2:54 AM

LG. Thanks!

This revision is now accepted and ready to land.Jun 16 2020, 2:54 AM

Can someone with write access merge this please?
Thank you!

This revision was automatically updated to reflect the committed changes.