This is an archive of the discontinued LLVM Phabricator instance.

Add infrastructure to support matcher names.
Needs ReviewPublic

Authored by jcking1034 on Nov 15 2021, 10:24 AM.

Details

Summary

This contains changes to the AST Matcher infrastructure in order to provide
contextual information about each matcher. This information can be useful for
tasks such as performing introspection on matchers.

One example which demonstrates the value of such changes can be seen in
https://reviews.llvm.org/D113943, which allows for the introspection of
matchers to improve the process of debugging matchers.

Diff Detail

Event Timeline

jcking1034 requested review of this revision.Nov 15 2021, 10:24 AM
jcking1034 created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptNov 15 2021, 10:25 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
jcking1034 added inline comments.Nov 15 2021, 10:31 AM
clang/include/clang/ASTMatchers/ASTMatchersInternal.h
1021

Here and below, we have the option to use the name of the matcher (for this, we could use "hasOverloadedOperatorName" instead). However, some of these classes are used in multiple matchers. For example, HasOverloadedOperatorNameMatcher is used in both hasOverloadedOperatorName and hasAnyOverloadedOperatorName, which I'm a bit concerned about.

1409

@ymandel suggested an alternate implementation where we instead pass the matcher name to makeAllOfComposite, which then passes the name to constructVariadic, to avoid making changes to BindableMatcher. I may look into this again, but my previous attempts to try this approach seemed messy due to the fact that these functions are used in a few different places, and because makeAllOfComposite handles cases with 0 or 1 inner matchers without constructing a variadic matcher.

For a little more context: we're working to improve the debuggability of matchers. The first step in that direction seems to be a change of this sort that provides some (dynamic) introspection so that a tool, library, etc. can see some basic information about the matcher.

note that this leaves a fair amount of room for different design decisions. Of all those that we've considered, the most relevant seems to be using an enum for the result of getName (that is, the introspection method). We chose a string so as to easily accomodate user-defined matchers, both those defined with the macros and otherwise. However, we are open to different proposals.

In addition, James is preparing a follow up patch with a new matcher that exploits this functionality to create detailed matcher logging, which is really helpful for debugging matchers during development.

jcking1034 edited the summary of this revision. (Show Details)Nov 15 2021, 2:49 PM

Add tests, rename getMatcherName to getName, update makeAllOfComposite

jcking1034 added inline comments.Nov 17 2021, 2:47 PM
clang/include/clang/ASTMatchers/ASTMatchersInternal.h
1409

Following up on this, I've gone ahead and implemented @ymandel 's approach, as the original implementation led to some issues with getName returning the wrong name.

Looks good, just small comments.

clang/include/clang/ASTMatchers/ASTMatchersInternal.h
88

Please note in the comments that this function template is specialized below to cover most (all) of the AST. So, the default string here should only very rarely (never?) be used.

436

As a pure virtual method, this is defining an API. Please add a comment that explains to users and implementors any expecations/requirements on this method.

1022

here and below: drop the "Matcher" suffix?

1378

here and line 1381: maybe replace "return" with "wrap"? The existing comment is wrong as is and feels even more off with this change.

jcking1034 marked 2 inline comments as done.

Update comments, drop "Matcher" suffix from getNames

jcking1034 marked an inline comment as done.Nov 18 2021, 2:26 PM
jcking1034 added inline comments.
clang/include/clang/ASTMatchers/ASTMatchersInternal.h
1022

Done. I've realized that this is a separate issue, but I'm still somewhat concerned about how the HasOverloadedOperatorNameMatcher is used in two different matchers, such that both hasOverloadedOperatorName and hasAnyOverloadedOperatorName will both have the name "HasOverloadedOperatorName".

hokein added a subscriber: hokein.Nov 23 2021, 2:23 AM

The approach looks fine in general, just some nits when reading through the patch.

clang/include/clang/ASTMatchers/ASTMatchersInternal.h
152

These are types that are not covered in the above gen .inc files. I wonder is there a way to verify this list is complete?

clang/lib/ASTMatchers/ASTMatchersInternal.cpp
120

std::move(MatcherName), and elsewhere.

164

_MatcherName => MatcherName

306

std::move(MatcherName)

jcking1034 marked 3 inline comments as done.

Address comments.

jcking1034 added inline comments.Nov 24 2021, 2:34 PM
clang/include/clang/ASTMatchers/ASTMatchersInternal.h
152

I compared this list to other portions of the code where a similar pattern was used, and have just added specializations for OMP-related nodes. In these other portions of code, I didn't notice anything that verifies completeness, but I agree that this would be preferable.

@hokein @aaron.ballman Following up on this, let me know if there are any other action items to address!

aaron.ballman added inline comments.Dec 14 2021, 12:29 PM
clang/include/clang/ASTMatchers/ASTMatchersInternal.h
88

Should we be asserting that we never use it so that we can have 100% coverage up front and rely on the assertion to catch later issues?

90

Does the return type need to be std::string instead of StringRef? It looks like everything is returning a string literal.

561

Is there a need for this to accept a std::string as opposed to StringRef?

clang/lib/ASTMatchers/ASTMatchersInternal.cpp
130

FWIW, this is the only place I was expecting to see a std::string (places where it's needed for long-term storage).

Aaron - agreed on the points about StringRef vs std::string. But, before I make that change, what did you think of moving to a more general method getMatcherSpec that returns an object? That object will provide the name (for now) and other data in the future (like the enum we discussed in the RFC thread)?

Aaron - agreed on the points about StringRef vs std::string. But, before I make that change, what did you think of moving to a more general method getMatcherSpec that returns an object? That object will provide the name (for now) and other data in the future (like the enum we discussed in the RFC thread)?

I think that's a reasonable approach!

gilbo added a subscriber: gilbo.Mar 16 2022, 2:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 16 2022, 2:57 PM