This is an archive of the discontinued LLVM Phabricator instance.

Introducing clang::tooling::AtomicChange for refactoring tools.
ClosedPublic

Authored by ioeric on Nov 23 2016, 8:44 AM.

Details

Summary

An AtomicChange is used to create and group a set of source edits, e.g.
replacements or header insertions. Edits in an AtomicChange should be related,
e.g. replacements for the same type reference and the corresponding header
insertion/deletion.

An AtomicChange is uniquely identified by a key position and will either be
fully applied or not applied at all. The key position should be the location
of the key syntactical element that is being changed, e.g. the call to a
refactored method.

Next step: add a tool that applies AtomicChange.

Event Timeline

ioeric updated this revision to Diff 79099.Nov 23 2016, 8:44 AM
ioeric retitled this revision from to Introducing clang::tooling::EditList for refactoring tools..
ioeric updated this object.
ioeric added a reviewer: klimek.
ioeric added subscribers: djasper, cfe-commits.

Ping.

Daniel, could you take a look? Manuel and I would like your opinion on this.

djasper added inline comments.Nov 30 2016, 1:57 AM
include/clang/Tooling/Refactoring/EditList.h
41 ↗(On Diff #79099)

I wonder whether we should always use a SourceLocation as key or whether we want to leave this up to the users. E.g. we could make this take a string parameter and provide a

string getKeyForLocation(const SourceManager &SM, SourceLocation KeyPosition);
43 ↗(On Diff #79099)

Does this need to be public? Specifically, couldn't the YamlTraits create a NormalizedEditList and then convertFromYAML can create the EditList object, which should have access to a private constructor?

82–94 ↗(On Diff #79099)

Do we currently actually need these functions? They seem a bit dangerous.

117 ↗(On Diff #79099)

Maybe add a short description of what "Key" contains. The others are pretty intuitive.

lib/Tooling/Refactoring/EditList.cpp
25 ↗(On Diff #79099)

I don't understand (without more comments) why this is normalized/denormalized.

unittests/Tooling/RefactoringTest.cpp
1113

I think these need to be ASSERT_TRUE.

ioeric updated this revision to Diff 79716.Nov 30 2016, 3:22 AM
ioeric marked 4 inline comments as done.
  • Addressed review comments.
include/clang/Tooling/Refactoring/EditList.h
41 ↗(On Diff #79099)

I think the key idea of EditList is that it groups a set of edits around a key position. For refactoring, using position as key makes sense to me (since all edits will be around some position), and I couldn't think of other things that are good to be a key here.

43 ↗(On Diff #79099)

You are right! Make this private now.

82–94 ↗(On Diff #79099)

I think these functions are helpful. Insertion conflict is by far the most common conflict I've seen, and several users have asked for ways to resolve this. And generally, resolving such conflict is not straight-forward. Also notice that these interfaces only resolve insertion conflict, other conflicts (e.g. overlaps) will still raise errors.

Although inserting all texts at one location as a single replacement is preferred, I've seen several tools that use these in some way, i.e. generating all insertions at the same location is hard or impossible.

klimek added inline comments.Nov 30 2016, 4:41 AM
include/clang/Tooling/Refactoring/EditList.h
41 ↗(On Diff #79099)

An idea would be an edit that just inserts multiple headers, but does not directly touch a source location (thus, we might want to key things off of files?).

ioeric added inline comments.Nov 30 2016, 4:56 AM
include/clang/Tooling/Refactoring/EditList.h
41 ↗(On Diff #79099)

We could make the key position the start of the file in this case. But looks like a customize-able key could potentially enable more opportunities, I'll go with Daniel's suggestion. Thanks!

ioeric updated this revision to Diff 79728.Nov 30 2016, 6:01 AM
  • Make key customize-able.
include/clang/Tooling/Refactoring/EditList.h
41 ↗(On Diff #79099)

So I tried the getKeyForLocation way, but we still need FilePath in the constructor to fully initialize an EditList.

I think it might be better if we keep the current constructor (since this will be the most-commonly used one IMO) and adds another constructor that takes FilePath and Key.

djasper added inline comments.Nov 30 2016, 6:17 AM
include/clang/Tooling/Refactoring/EditList.h
41 ↗(On Diff #79099)

Ah, but you might want to include more details in the Key, e.g. the branch or source revision. My point is that the user might be better equipped to know what a valid key might be. But I am happy to keep it as is and add another constructor where we pass in a key when needed.

82–94 ↗(On Diff #79099)

I still have several concerns:

  1. This interface actually exposes two different layers: It has special function to insert text and headers and also the functions that add or merge replacements. IMO, that's not ideal.
  2. There is now no function insert() if users actually want to get the error for a conflict. Yes, they can then instead use addReplacement, but again, that's a completely separate layer of the interface.
  3. The logic of before after AND that this only applies to two conflicting insertions is a bit hard to explain.

Not sure what the right trade-off for all of these is. Maybe we should just remove addReplacement and mergeReplacements and instead add a replace() function? Maybe we should just have a single insert() function with an optional fourth parameter that can be used to control conflict resolution.

lib/Tooling/Refactoring/EditList.cpp
164 ↗(On Diff #79728)

Don't duplicate this code. Add an internal function that takes an extra parameter that controls conflict resolution.

ioeric added inline comments.Nov 30 2016, 6:52 AM
include/clang/Tooling/Refactoring/EditList.h
82–94 ↗(On Diff #79099)

replace + insert sounds like a good way to go. Do we want to support resolving general conflicts besides insertions though?

djasper added inline comments.Nov 30 2016, 6:55 AM
include/clang/Tooling/Refactoring/EditList.h
82–94 ↗(On Diff #79099)

I think that might not be necessary. If so, we might need to see a few use cases to be sure to get it right.

ioeric updated this revision to Diff 79736.Nov 30 2016, 7:30 AM
  • Added replace() and insert() to replace current replacement interfaces.
ioeric updated this revision to Diff 90011.EditedFeb 28 2017, 5:00 AM
  • s/EditList/AtomicChange/
ioeric retitled this revision from Introducing clang::tooling::EditList for refactoring tools. to Introducing clang::tooling::AtomicChange for refactoring tools..Feb 28 2017, 5:01 AM
ioeric edited the summary of this revision. (Show Details)
include/clang/Tooling/Refactoring/AtomicChange.h
60

i assume i might be missing smth - why here and above (in getKey, getFilePath, getError) std::string is returned by value ?

lib/Tooling/Refactoring/AtomicChange.cpp
36

if i am not mistaken this can be done in the intialization list:

Replaces(E.getReplacements().begin(), E.getReplacements().end())
ioeric marked an inline comment as done.Feb 28 2017, 5:30 AM
ioeric added inline comments.
include/clang/Tooling/Refactoring/AtomicChange.h
60

Otherwise, users would need to worry about the lifetimes of the object. And these methods are not expected to be called intensively, so performance is not an issue.

lib/Tooling/Refactoring/AtomicChange.cpp
36

You are right.

ioeric updated this revision to Diff 90012.Feb 28 2017, 5:30 AM
ioeric marked an inline comment as done.
  • Addressed comments.
include/clang/Tooling/Refactoring/AtomicChange.h
60

i see, thanks for the explanation, probably not particularly important
i thought that if a caller wanted to get a copy / take ownership
it would just accept it by value, i.e.

auto path = change.getFilePath() /* assuming getFilePath returns const ref */

At the same time if for example we need to check the filepath (or just print it),
not copies would be required:

llvm::errs() << change.getFilePath();
ioeric updated this revision to Diff 90018.Feb 28 2017, 5:52 AM
ioeric marked 3 inline comments as done.
  • Return const refs in getXXX.
include/clang/Tooling/Refactoring/AtomicChange.h
60

I don't really have strong opinion about this. Another thing I was considering is whether the returned value always has an associated object in the class. For example, getError might return Error plus some extra message so that the return string can't be a reference. But these interfaces look pretty stable for now. And your examples make sense. Changed them to const ref instead.

This revision is now accepted and ready to land.Feb 28 2017, 8:34 AM
ioeric edited the summary of this revision. (Show Details)Mar 1 2017, 5:17 AM
This revision was automatically updated to reflect the committed changes.