This is an archive of the discontinued LLVM Phabricator instance.

[libTooling] Add parser for string representation of `RangeSelector`.
ClosedPublic

Authored by ymandel on Jun 15 2020, 12:50 PM.

Details

Summary

This patch adds a parser for a RangeSelector written as a string. The format
is closely based on the way one would right the selector in C++, except that
bound nodes are referred to by their bound identifier, rather than writing them
as strings (that is, no quotes are used). This should enable use of
RangeSelectors from tools like clang-query and web UIs.

Diff Detail

Event Timeline

ymandel created this revision.Jun 15 2020, 12:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2020, 12:50 PM
ymandel updated this revision to Diff 270866.Jun 15 2020, 2:21 PM

Added cmake changes.

gribozavr2 added inline comments.Jun 16 2020, 11:10 AM
clang/include/clang/Tooling/Transformer/Parsing.h
2

Please pad the first line to match the other line (throughout the patch).

28

"string representation"

clang/lib/Tooling/Transformer/Parsing.cpp
30

I'm a bit concerned about the abundance of parsers in Clang tooling: we already have a similar language for dynamic AST matchers. I'm concerned about both code duplication as such, and that there are multiple ways to do something.

Code duplication makes it more difficult to carry out horizontal efforts like improving error reporting.

Multiple ways to do something makes codebase knowledge less reusable. It might also create language discrepancies that users might notice (for example, I don't remember if bind(id) works in dynamic AST matchers or not; we would be better off if range selectors were consistent with that).

I don't think this particular change decisively tips the scale towards refactoring the parser for dynamic AST matchers to be reusable here; however it is an option worth considering. I think we should be thinking about the total cost of ownership of this code.

Some future use cases will also need an embeddable language (like AST matchers for the syntax tree, or parsing stencils from strings).

clang/unittests/Tooling/RangeSelectorTest.cpp
271

I wonder if we could figure out a more direct testing strategy for a parser (that does not necessarily involve using the parsed objects) if we had more advanced parsing infrastructure.

ymandel updated this revision to Diff 271246.Jun 16 2020, 4:49 PM

address comments

ymandel marked 6 inline comments as done.Jun 16 2020, 5:02 PM

Thanks for the review!

clang/lib/Tooling/Transformer/Parsing.cpp
30

Agreed on these points. We'd like to move ahead with this patch, but have made the following changes:

  1. Added a FIXME in the implementation file indicating intent to merge with AST matcher parser code
  2. Modified the accepted language for consistency with AST matchers (and C++ code). Node ids must be surrounded by quotes.

One possible complication is that the stencil format-string parser (forthcoming) will not easily be merged into a lexer-based parser, given that it allows free text for the format string (only the special escape sequences have specified structure). So, we will need to find a way to operate both scannerless and not if we want a unified parser infrastructure. However, we can solve that problem when we come to it.

clang/unittests/Tooling/RangeSelectorTest.cpp
271

Indeed. That would certainly be preferable. Stencils support the toString operator for just such testing support. A generic parse tree would also allow direct testing, but then you still need to find a way to test the conversion from parse-tree to stencil. So, ultimately, you want to have some reflection on the type itself.

I propose that we move RangeSelector to implement MatchComputation rather than MatchConsumer to support toString and better tests. WDYT?

gribozavr2 accepted this revision.Jun 18 2020, 10:10 AM
gribozavr2 added inline comments.
clang/lib/Tooling/Transformer/Parsing.cpp
30

We'd like to move ahead with this patch, but have made the following changes

SGTM.

the stencil format-string parser (forthcoming) will not easily be merged into a lexer-based parser

I'd have to take a closer look to provide a meaningful response, but postponing this discussion to the patch that will be adding this code SGTM.

170

s/matcher/AST matcher/

171

s/accept//

clang/unittests/Tooling/RangeSelectorTest.cpp
271

I propose that we move RangeSelector to implement MatchComputation rather than MatchConsumer to support toString and better tests. WDYT?

+1, makes sense. Initially they were separate abstractions because we thought we didn't need the complexity everywhere, but turns out we do.

This revision is now accepted and ready to land.Jun 18 2020, 10:10 AM
ymandel updated this revision to Diff 271770.Jun 18 2020, 10:29 AM
ymandel marked 3 inline comments as done.

address comments

ymandel updated this revision to Diff 271782.Jun 18 2020, 11:03 AM
ymandel marked 3 inline comments as done.

removed Twine argument per clang-tidy warning.

ymandel closed this revision.Jun 19 2020, 9:58 AM

Committed as revision rG9ca50e887db7.