This is an archive of the discontinued LLVM Phabricator instance.

[ASTMatcher] Fix a performance regression: memorize the child match.
ClosedPublic

Authored by hokein on Jun 29 2020, 7:43 AM.

Details

Summary

D80025 introduced a performance regression: in some cases, it makes
clang-tidy readability-container-size-empty ~80x slower (running on an internal
huge TU, before that patch 12s vs after 950s).

after this patch, we go back to 12s.

Diff Detail

Event Timeline

hokein created this revision.Jun 29 2020, 7:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2020, 7:43 AM

In what situation are we calling child matchers repeatedly with the same matcher on the same node?

klimek added inline comments.Jun 29 2020, 8:37 AM
clang/lib/ASTMatchers/ASTMatchFinder.cpp
467–468

Here and below: s/memorize/memoize/

hokein updated this revision to Diff 274347.Jun 30 2020, 12:22 AM

Address comments.

hokein marked an inline comment as done.Jun 30 2020, 12:24 AM

In what situation are we calling child matchers repeatedly with the same matcher on the same node?

I guess a pattern like https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clang-tidy/readability/ContainerSizeEmptyCheck.cpp#L29-L40?

In what situation are we calling child matchers repeatedly with the same matcher on the same node?

I guess a pattern like https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clang-tidy/readability/ContainerSizeEmptyCheck.cpp#L29-L40?

Ah, so this is when we go hasDeclaration.

I'm wondering whether what we really want to memoize are matchers that go across the AST, like hasType, hasDeclaration, etc, where we can expect a linear number of nodes hitting that matcher.

hasChild specifically seems weird to memoize when we don't memoize other has matchers.

(I can't quite follow the original reason for disabling memoization in the non-recursive case, so this seems reasonable but going to leave that to Manuel who knew the argument)

clang/lib/ASTMatchers/ASTMatchFinder.cpp
468

This line looks a lot like a cache-poisoning bug, until realizing that MaxDepth is never actually used as an integer, it's a boolean - 1 means non-recursive and 0 means recursive. (Note that the recursive call passes MaxDepth, not MaxDepth-1).
I guess we should just fix the type, actually bounding the depth doesn't seem likely to be done anytime soon.

468

er, 1 means non-recursive and anything else means recursive...

klimek accepted this revision.Jun 30 2020, 4:56 AM

LG

clang/lib/ASTMatchers/ASTMatchFinder.cpp
468

That sounds like a historical reason - feel free to change :) (I don't even remember how that came to be).

This revision is now accepted and ready to land.Jun 30 2020, 4:56 AM
This revision was automatically updated to reflect the committed changes.