This is an archive of the discontinued LLVM Phabricator instance.

[Syntax] Allow to mutate syntax trees
ClosedPublic

Authored by ilya-biryukov on Jul 11 2019, 9:44 AM.

Details

Summary

This patch adds facilities to mutate the syntax trees and produce
corresponding text replacements.

The public interface of the syntax library now includes facilities to:

  1. perform type-safe modifications of syntax trees,
  2. compute textual replacements to apply the modifications,
  3. create syntax trees not backed by the source code.

For each of the three, we only add a few example transformations in this
patch to illustrate the idea, support for more kinds of nodes and
transformations will be done in follow-up patches.

The high-level mutation operations are implemented on top of operations
that allow to arbitrarily change the trees. They are considered to be
implementation details and are not available to the users of the
library.

Diff Detail

Event Timeline

ilya-biryukov created this revision.Jul 11 2019, 9:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2019, 9:44 AM
Herald added a subscriber: mgorny. · View Herald Transcript
gribozavr2 accepted this revision.Dec 13 2019, 8:27 AM

Generally looks good, just nitpicks.

clang/include/clang/Tooling/Syntax/BuildTree.h
24

"Creates syntax trees..."

25

Why a class to contain static functions? Is it like a namespace, or is there some future design intent?..

Or is it for the friendship?

clang/include/clang/Tooling/Syntax/Mutations.h
27

Removes a statement, or replaces it with an empty statement where one is required syntactically.

(provide an explanation why it may sometimes use an empty statement -- I think it is not obvious)

34

Please add the newline.

clang/include/clang/Tooling/Syntax/Tree.h
87

Please clarify what happens to mixed content (some subtrees are from the source code, some are synthesized).

90

Please also document what isOriginal means for syntax nodes that were moved within the tree (re-parented). For example, think of a refactoring that swaps "then" and "else" branches of an "if" statement.

126

Is it worth eagerly computing the "can modify" bit? Mapping expanded tokens to spelled is log(n) after all, and doing it for every syntax node can add up to nontrivial costs...

174

s/Before/BeforeBegin/? This way begin/end pairing is more obvious.

clang/lib/Tooling/Syntax/BuildTree.cpp
661

I can imagine that both building the syntax tree from the AST, and synthesizing syntax trees from thin air will eventually grow pretty long. So I'd like to suggest to split synthesis of syntax trees into a separate file, even though it will be pretty short today.

676

Please add the newline below.

clang/lib/Tooling/Syntax/ComputeReplacements.cpp
21

I'd probably change this struct into a function with a local lambda... I feel like it is weird to call a constructor just for the side-effects.

54

BeginOfSpan/EndOfSpan or SpanBegin/SpanEnd sound more natural to me.

80

Why not a reference for the TU?

87

s/RemovedRange/ReplacedRange/ ?

91

Unclear whether this comment is explaining the assumption, or explaining the assert.

"Check that the removed range points..."

93

This assertion duplicates the one in rangeOfExpanded; do we need to duplicate?

The code in this function does not depend on the assumption that tokens are expanded tokens; it is rangeOfExpanded that does.

If you decide to remove the assertion here, please move the comment from here into that function.

125

Please add the newline.

clang/lib/Tooling/Syntax/Tree.cpp
89

Please mention in the doc comment that New can be null.

clang/unittests/Tooling/Syntax/TreeTest.cpp
57

Seems like these first/last helpers should be methods on syntax::Node.

66

The first and the last tokens are not necessarily from the same buffer...

130

"deepest node" (singular)

698

I know this is 100% bikeshedding, but why "X"? I'd suggest "I" for "immutable". Feel free to ignore this comment though...

703

Could you also do something like:

#define OPEN {
#define CLOSE }

void test1() {
  OPEN
    1;
  CLOSE
}
void test1() {
  OPEN
    1;
  }
}
This revision is now accepted and ready to land.Dec 13 2019, 8:27 AM
ilya-biryukov marked 24 inline comments as done.
  • Addressed most comments
  • Rebased onto HEAD
ilya-biryukov added inline comments.Dec 17 2019, 12:03 PM
clang/include/clang/Tooling/Syntax/BuildTree.h
25

Good point, this pattern is terrible.
It is for friendship, but this is supposed to be an implementation detail, therefore shouldn't affect user interface.

I turned these into free-standing functions. Not sure if we should group those into a namespace for simpler discoverability via code completion (although one can just do syntax::create^ to get the same effect now)

clang/include/clang/Tooling/Syntax/Tree.h
126

I would expect this bit to be requested quite often, so I think this would actually pay off in the long run. Requesting this bit should be very cheap.

Note that we do quite a bit of O(log N) searches in the same code that sets CanModify bit, so it does not actually make the complexity worse, but definitely makes the constant larger.

If this starts being the bottleneck (I bet this will happen eventually), I believe we can compute it eagerly with O(1) amortized cost. That would involve traversing the AST in the true source code order, which will require tweaking the RecursiveASTVisitor and is much easier done when we have good test coverage with existing implementation.

clang/lib/Tooling/Syntax/BuildTree.cpp
661

Good point.
I've moved only the implementation for now, declaration of synthesis functions are still in BuildTree.h as it seems small enough for now.

I can totally imagine us moving those to a separate file later, though.

clang/lib/Tooling/Syntax/ComputeReplacements.cpp
80

All accessors on syntax tree return pointers (almost everything can be null), so using pointers everywhere just makes the code more uniform.

This one does look a bit different, though, agree that using a reference in this public API is probably more natural.

93

Good point, thanks. Moved the comment to rangeOfExpanded, we don't need to assert in both places.

Unit tests: pass. 60958 tests passed, 0 failed and 726 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

ilya-biryukov marked 6 inline comments as done.
  • Fix a header guard
  • Make firstLeaf and lastLeaf methods inside Tree
  • Mark non-modifiable nodes with I (for immutable)
ilya-biryukov marked an inline comment as not done.Dec 18 2019, 2:50 AM
ilya-biryukov added inline comments.
clang/unittests/Tooling/Syntax/TreeTest.cpp
57

Good point. Done.

66

They are for nodes with isOriginal() == true. I've added an assertion.
Exactly the reason why this method is not a good fit for public API, but ok to have in tests.

703

Funnily enough, this causes an assertion failure, because binary-searching with isBeforeInTranslationUnit finds { expanded from OPEN instead of 1 when building a syntax tree.

I'll make use of a hash table for searching tokens by location and add the test in the follow-up patch.

This revision was automatically updated to reflect the committed changes.
ilya-biryukov marked 3 inline comments as done.Dec 18 2019, 3:25 AM
ilya-biryukov added inline comments.
clang/unittests/Tooling/Syntax/TreeTest.cpp
703

c1bbefef9d36e84e469513374ef404b9e354b262 adds the corresponding test.

ilya-biryukov marked an inline comment as done.Dec 18 2019, 3:25 AM

Updated version LGTM, thanks!

Ka-Ka added a subscriber: Ka-Ka.Dec 20 2019, 2:54 AM
Ka-Ka added inline comments.
clang/lib/Tooling/Syntax/Mutations.cpp
75

This line introduce a buildbot fail:

/home/buildbots/ppc64be-clang-multistage-test/clang-ppc64be-multistage/llvm/clang/lib/Tooling/Syntax/Mutations.cpp:74:13: warning: unused variable ‘Parent’ [-Wunused-variable]

clang/lib/Tooling/Syntax/Tokens.cpp
70

This line introduce a buildbot failure:

/home/buildbots/ppc64be-clang-multistage-test/clang-ppc64be-multistage/llvm/clang/lib/Tooling/Syntax/Tokens.cpp:70:53: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]

Thanks for the fixes!