This is an archive of the discontinued LLVM Phabricator instance.

[clang-refactor] introducing clang-refactor
AbandonedPublic

Authored by omtcyfz on Sep 2 2016, 7:28 AM.

Details

Reviewers
None
Summary

Adding clang-refactor with a dummy template refactoring module.

The tool itself doesn't have to be perfect in the first iteration, so I would be really happy to push this one early enough so that everyone could start building modules on top of it. As soon as this one is landed more refactoring and cleanup patches are welcome. This also isn't about "the one true way how clang-refactor has to be designed". Thus said, consider this version of clang-refactor to be highly experimental.

Clang-refactor design doc can be found here: https://docs.google.com/document/d/1w9IkR0_Gqmd5w4CZ2t_ZDZrNLYVirQPyMS41533HQZE/edit?usp=sharing

UPD: This diff was stalled for a while, clang-refactor was introduced in https://reviews.llvm.org/D36574 patch by araphman. Removing everyone from subscribers and reviewers for convenience.

Diff Detail

Event Timeline

omtcyfz updated this revision to Diff 70163.Sep 2 2016, 7:28 AM
omtcyfz retitled this revision from to [clang-refactor] introducing clang-refactor.
omtcyfz updated this object.
omtcyfz added reviewers: alexfh, klimek, ioeric, bkramer.
omtcyfz added a subscriber: cfe-commits.
ioeric edited edge metadata.Sep 2 2016, 9:08 AM
  • You mentioned a design doc in the summary; maybe also include a link to it?
  • It would make the review easier if you could separate the migration of clang-rename into another patch...I think clang-refactor is really the interesting part here. And maybe also add a small dummy sub-module as an example demonstrating how to add a new sub-module, which I believe will be helpful for future developers. (See also tool-template which exists for the same purpose.)
TemplatedClassFunction.cpp
1

What is this file for?

  • You mentioned a design doc in the summary; maybe also include a link to it?

Done.

  • It would make the review easier if you could separate the migration of clang-rename into another patch...I think clang-refactor is really the interesting part here. And maybe also add a small dummy sub-module as an example demonstrating how to add a new sub-module, which I believe will be helpful for future developers. (See also tool-template which exists for the same purpose.)

I am not against it, but I do not think that an empty tool, which just introduces the binary without any code would be reasonable. This is basically why I wanted it to be all-in-place.

The whole changelist doesn't introduce anything really new, so I assume it might be fine. But if you are not okay with it I can split.

TemplatedClassFunction.cpp
1

This is a test. It is the reason clang-refactor rename fails :)

omtcyfz updated this object.Sep 2 2016, 10:15 AM
omtcyfz edited edge metadata.
omtcyfz updated this revision to Diff 70191.Sep 2 2016, 10:55 AM
omtcyfz updated this object.

Removed "clang-refactor (as opposed to clang-rename) now uses LLVM policy about brackets around single line statements in if/while/..." change to make the review a bit easier.

omtcyfz updated this revision to Diff 70194.Sep 2 2016, 11:03 AM

Revert diff, as the last one "deletes and creates" files instead of "moving and changing them" in the filesystem.

  • It would make the review easier if you could separate the migration of clang-rename into another patch...

Another point is that if I try to separate the migration - what do I do about USREngine? USREngine is basically the core of clang-refactor at the moment and I can't detach it from clang-rename at the same time.

  • It would make the review easier if you could separate the migration of clang-rename into another patch...

Another point is that if I try to separate the migration - what do I do about USREngine? USREngine is basically the core of clang-refactor at the moment and I can't detach it from clang-rename at the same time.

I'm not sure why USREngine is the core of clang-refactor. It seems to me that USREngine is more closely tied to clang-rename than to clang-refactor. At least USREngine is not essential to all refactor tools, and it is more like a library that sub-modules can use.

  • It would make the review easier if you could separate the migration of clang-rename into another patch...

Another point is that if I try to separate the migration - what do I do about USREngine? USREngine is basically the core of clang-refactor at the moment and I can't detach it from clang-rename at the same time.

I'm not sure why USREngine is the core of clang-refactor. It seems to me that USREngine is more closely tied to clang-rename than to clang-refactor. At least USREngine is not essential to all refactor tools, and it is more like a library that sub-modules can use.

It is essential to all of the tools I wrote about in design doc.

Well, are you proposing to create an "empty" clang-refactor binary in one patch and adding meaningful code in the other? I am not sure if just creating clang-refactor/driver/Driver.cpp with main, which doesn't do anything is a good idea, but if you think it is - I'll do that.

  • You mentioned a design doc in the summary; maybe also include a link to it?

Done.

  • It would make the review easier if you could separate the migration of clang-rename into another patch...I think clang-refactor is really the interesting part here. And maybe also add a small dummy sub-module as an example demonstrating how to add a new sub-module, which I believe will be helpful for future developers. (See also tool-template which exists for the same purpose.)

I am not against it, but I do not think that an empty tool, which just introduces the binary without any code would be reasonable. This is basically why I wanted it to be all-in-place.

The whole changelist doesn't introduce anything really new, so I assume it might be fine. But if you are not okay with it I can split.

Don't worry about a patch being small. Reviewers like small patches :)

The reason that I suggested a dummy sub-tool is to lower the bar for developers, especially those who have never developed a clang tool, so that they can easily figure out how to set up a clang-refactor sub-tool.

For a "swiss-army knife" tool like clang-refactor, I would worry more about the extendibility of the infrastructure instead of functionality at this stage. At least, I would expect it to be much easier to create a refactor tool under clang-refactor than creating a standalone clang tool. For example, it would be really nice if there is a tool base that a sub-module can derive from so that I don't have to go through the whole process of creating a clang tool and setting up unit/lit test environment :)

  • It would make the review easier if you could separate the migration of clang-rename into another patch...

Another point is that if I try to separate the migration - what do I do about USREngine? USREngine is basically the core of clang-refactor at the moment and I can't detach it from clang-rename at the same time.

I'm not sure why USREngine is the core of clang-refactor. It seems to me that USREngine is more closely tied to clang-rename than to clang-refactor. At least USREngine is not essential to all refactor tools, and it is more like a library that sub-modules can use.

It is essential to all of the tools I wrote about in design doc.

Well, are you proposing to create an "empty" clang-refactor binary in one patch and adding meaningful code in the other? I am not sure if just creating clang-refactor/driver/Driver.cpp with main, which doesn't do anything is a good idea, but if you think it is - I'll do that.

It was not trivial to me why USREngine is so important to those tools. You might want to address that in the design doc as well. And given the weight USREngine carries in clang-refactor as you suggested, I think it deserves an more detailed introduction in the design doc.

alexander-shaposhnikov added inline comments.
clang-refactor/driver/Rename.h
193

if i am not mistaken if the file has not been modified this will print smth like
"invalid source location" or similar. Pls, double check me.

It was not trivial to me why USREngine is so important to those tools. You might want to address that in the design doc as well. And given the weight USREngine carries in clang-refactor as you suggested, I think it deserves an more detailed introduction in the design doc.

Yes, I agree, it might be a little confusing. The only mention I have is:

The most important facility would allow lookup of symbol definitions and usages (probably, based on Unified Symbol Resolutions, USR)

But it doesn't get into too much details anyway. However, the idea of the design doc was not to focus on implementation details too much as they are a subject to change.

Thus, I'm not sure whether I should elaborate details of USREngine there.

clang-refactor/driver/Rename.h
193

Sorry, can you please elaborate?

Also, this might be unrelated to this exact patch, but if there is an issue it's good to fix it in another patch. Basically, this is not new code, it's been there in clang-rename.

Don't worry about a patch being small. Reviewers like small patches :)

The reason that I suggested a dummy sub-tool is to lower the bar for developers, especially those who have never developed a clang tool, so that they can easily figure out how to set up a clang-refactor sub-tool.

For a "swiss-army knife" tool like clang-refactor, I would worry more about the extendibility of the infrastructure instead of functionality at this stage. At least, I would expect it to be much easier to create a refactor tool under clang-refactor than creating a standalone clang tool. For example, it would be really nice if there is a tool base that a sub-module can derive from so that I don't have to go through the whole process of creating a clang tool and setting up unit/lit test environment :)

Yes, you are absolutely right. This patch is just a mess, because I tried to push everything at once while most of the stuff is not related to clang-refactor part at all.

I pushed 3 very basic cleanup patches (namely rL280638, rL280639 and rL280640), which don't really need reviewing (and also for the sake of development speed). And I also kept clang-rename improvements separate in D24224, which is up for review.

I'll update this patch tomorrow.

What I'm about to do after is to either:

  1. Make two patches, one of which would basically create "empty" clang-refactor and introduce dummy submodule to show how it works (as you proposed), and the other would merge clang-rename as a submodule.
  2. Make one patch, which both creates clang-refactor binary and merges clang-rename as a submodule.

The first variant seems better to me while I see a couple of issues with that, which makes sticking to the second variant a bit less challenging. We can discuss it tomorr..today and come up with a solution.

Thank you very much for your comments, Eric!

omtcyfz updated this revision to Diff 70373.EditedSep 6 2016, 3:35 AM
omtcyfz updated this object.
omtcyfz removed a reviewer: bkramer.
omtcyfz added subscribers: bkramer, hokein.

Removed whole clang-rename part, only making this patch clang-refactor specific.

Bringing results of an offline discussion with Eric (@ioeric) live.

Eric's point was that this patch should only care about clang-refactor and introduce changes directly related to creating clang-rename. clang-rename and all other tools migration can be done later, which is also good in terms of this patch not growing too large.

Another point was that it should be easier if there would be a class, from which refactoring tools would derive and use it as an interface.

Advantages are as follows:

  • It would be easier to implement multi-TU support as the tools wouldn't simply care about that part while delegating the "hard work" to the common part.
  • It would be easier to implement new tools, because there will be a slightly simpler interface.
  • It would be easier to setup unit tests and other routine, because right now all tools have to implement basic routine for unit tests, for example, which is just copied from the other tools' unit tests with slight changes. Delegating that to the common interface is also nice.

The current diff isn't final, but I wanted to bring it to the reviews, because it gives an overview of what I am doing and if there are any high-level comments to address it's better to get them sooner than later.

hokein added a comment.Sep 6 2016, 6:13 AM

Eric's point was that this patch should only care about clang-refactor and introduce changes directly related to creating clang-rename. clang-rename and all other tools migration can be done later, which is also good in terms of this patch not growing too large.

I'm +1 on making this patch implement a skeleton of clang-refactor, which makes it small, and much easier for review. The patch looks much better now.

clang-refactor/driver/Driver.cpp
11

clang-tidy?

38

s/clan/clang

47

Any reason implementing the command-line options and not using the cl::opt here?

clang-refactor/driver/ModuleManager.h
24 ↗(On Diff #70373)

StringRef.

clang-refactor/modules/core/RefactoringModule.h
43 ↗(On Diff #70373)

s/wouold/would/

54 ↗(On Diff #70373)

"It it" doesn't make sense here.

89 ↗(On Diff #70373)

I'd make this interface a virtual function too, and provide this default implementation.

128 ↗(On Diff #70373)

Does the panic mean something like exit(1)? But from the interface, it looks like returning an error code, right?

146 ↗(On Diff #70373)

A further thought:

applyReplacement is more likely a postprocess step of handleTranslationUnit, what if some clang-refactor subtools just want to dump the results only? I'd prefer to rename it to OnEndOfHandlingTranslationUnit. I'm open to hear better suggestions.

148 ↗(On Diff #70373)

Use "= default".

150 ↗(On Diff #70373)

You can return a StringRef here?

152 ↗(On Diff #70373)

The same.

clang-refactor/modules/template/TemplateModule.h
15 ↗(On Diff #70373)

Not needed.

20 ↗(On Diff #70373)

It'd be clear to add a small paragraph describing that this is a template module which is only used to lower the "barrier to entry" for writing clang-refactor subtool.

omtcyfz updated this revision to Diff 70397.Sep 6 2016, 7:35 AM
omtcyfz marked 11 inline comments as done.

Addressing a round of comments.

Thank you for reviewing, @hokein!

Also, please note that this is not a final version, the interface will change a lot in the upcoming diffs.

clang-refactor/driver/Driver.cpp
47

I might be mistaken, but it'd require parsing whole option sequence, which should be delegated to the submodules.

E.g. "help" should be only called if the the tool is invoked as clang-refactor --help, but if I parse all options I'd also invoke "help" while having "clang-refactor rename --help" for example. At least that's what I have been thinking.

clang-refactor/modules/core/RefactoringModule.h
89 ↗(On Diff #70373)

I'll update the interface soonish.

Had an offline discussion with Alex about the interface and we came up to a conclusion that the interface should be way more strict. This one is just a general idea of how it might be implemented.

Thus, run shouldn't be overridden in my opinion, but I'll update the diff and make some high-level comments on that later.

128 ↗(On Diff #70373)
  1. Well, no. I believe that the error should be thrown to the run and "printed out" there.
  2. Also, not that we're aiming for a multi-TU multi-threaded architecture in the end (although it doesn't make sense with the current interface, but again, I'll update it later).
146 ↗(On Diff #70373)

/* comment about multi-TU multi-threaded stuff */

Suppose a refactoring can't be applied to some translation units. Say, "rename" tool encountered name conflict in any TU. In this case the tool should fail and not process the last step.

If your comment is about naming only, I'll change it, too.

clang-refactor/modules/template/TemplateModule.h
15 ↗(On Diff #70373)

Deleting TemplateModule.

20 ↗(On Diff #70373)

Deleting TemplateModule.

For the moment, just a few nitty-gritty comments inline.
What I miss here is (as already pointed by someone) an example on how to write a new module, register it etc.

clang-refactor/driver/Driver.cpp
42

Simpler:

std::string Invocation = argv[0] + (" " + Command);
clang-refactor/driver/ModuleManager.cpp
22–24 ↗(On Diff #70397)

Using find and then operator[] makes the search twice. IMO, it would be better to avoid that by:

auto it = CommandToModuleID.find(Command);
if (it != CommandToModuleID.end()) {
  return RegisteredModules[*it]->run(argc, argv);
}
clang-refactor/driver/ModuleManager.h
13–19 ↗(On Diff #70397)

clang-format includes (and make it a single block)?

21 ↗(On Diff #70397)

using namespace in header?!

clang-refactor/modules/core/RefactoringModule.h
36 ↗(On Diff #70397)

s/infromation/information

56 ↗(On Diff #70397)

s/occurance/occurrence

72 ↗(On Diff #70397)

s/temprorary/temporary

112 ↗(On Diff #70397)

s/regiestering/registering

alexfh edited edge metadata.Sep 7 2016, 3:54 AM

Kirill, first, thank you for working on this!

As discussed offline, the RefactoringModule interface should explicitly define:

  • separation of execution stages for each check:
    1. preparation or execution planning - should figure out the set of affected translation units and what should be done for each of them
    2. processing of affected translation units - ideally, this should be done in parallel, separate process per TU, and the interface should not prevent this
    3. application of replacements or error reporting, if stage 2 detected that execution of refactoring is not possible - looks like this is not refactoring-specific and can be done by the clang-refactor framework
  • state transferred between stages (since they should allow for independent execution, potentially in multiple processes, so data should be in a serializable form)
  • data and facilities available to each stage (e.g. cross-reference index, AST of a certain translation unit, ability to register PPCallbacks, interface to pass data to the next stage, report results or errors)
  • basic interfaces for the facilities mentioned above.

A failure to define these aspects would complicate introduction of multi-TU support and require significant changes to refactoring tools to follow a stricter interface.

alexfh requested changes to this revision.Sep 7 2016, 4:04 AM
alexfh edited edge metadata.
alexfh added inline comments.
clang-refactor/driver/Driver.cpp
42
(llvm::Twine(argv[0]) + " " + ...).str()
clang-refactor/driver/ModuleManager.h
21 ↗(On Diff #70397)

If you need StringRef, ArrayRef etc. in clang namespace, just include "clang/Basic/LLVM.h".

clang-refactor/modules/core/RefactoringModule.h
91 ↗(On Diff #70397)

I believe, run should be external to the module. Its implementation might be different depending on whether we're just running all stages sequentially or trying to parallelize processing of different translation units.

148 ↗(On Diff #70397)

This should be possible to implement in a refactoring-independent way in clang-refactor itself. Or do you see any issues with this?

This revision now requires changes to proceed.Sep 7 2016, 4:04 AM
omtcyfz updated this revision to Diff 70682.Sep 8 2016, 5:17 AM
omtcyfz edited edge metadata.
omtcyfz marked 8 inline comments as done.

Addressing few comments.

Major improvements on the way.

omtcyfz added inline comments.Sep 8 2016, 5:20 AM
clang-refactor/driver/ModuleManager.cpp
22–24 ↗(On Diff #70397)

The search is O(1) anyway, but the code itself probably becomes more reasonable, thanks.

clang-refactor/driver/ModuleManager.h
13–19 ↗(On Diff #70397)

I like to separate includes into three groups (LLVM includes, STL includes, subproject includes), which I think is not very bad. Haven't seen any policy prohibiting it. Feel free to elaborate, though.

Sorted the includes within the second group.

clang-refactor/modules/core/RefactoringModule.h
148 ↗(On Diff #70397)

Nope, totally agree.

curdeius added inline comments.Sep 8 2016, 5:41 AM
clang-refactor/driver/ModuleManager.h
14–20 ↗(On Diff #70682)

I thought that idea behind sorting includes using clang-format is to avoid handling groups and order manually.
I don't think that there is any policy prohibiting separating includes into groups, but AFAIK, there is one that says that STL includes should be the last (you should include in order from the most specific to the most generic, i.e. subproject, clang, llvm, STL).

omtcyfz updated this revision to Diff 70802.Sep 9 2016, 3:08 AM
omtcyfz edited edge metadata.
omtcyfz marked 2 inline comments as done.

Slightly improve the interface.

Patch is still not complete, though.

omtcyfz updated this revision to Diff 71776.Sep 19 2016, 1:20 AM

The diff doesn't really do much right now, but it kind of gives the idea of what's happening with clang-refactor.

ioeric added inline comments.Sep 19 2016, 2:53 AM
clang-refactor/driver/ModuleManager.h
14–20 ↗(On Diff #70682)

Fortunately, clang-format is smart enough to categorize #include groups (e.g. LLVM includes, STL includes, main header etc). We actually encourage people to combine #includes groups into one block and let clang-format handle the categorization (fyi: clang-format only sort includes within a block). The point is to free you from worrying about the formatting.

arphaman added inline comments.Sep 20 2016, 1:25 PM
clang-refactor/core/SymbolIndexClient.h
31 ↗(On Diff #71776)

End namespace clang?

arphaman added inline comments.Sep 20 2016, 1:55 PM
clang-refactor/driver/Driver.cpp
37

Is there a particular reason why you don't return a value that's returned by Manager.dispatch here? It seems odd to me that Manager.dispatch returns an integer value that's not actually used here.

ioeric added inline comments.Sep 21 2016, 1:18 AM
clang-refactor/driver/Driver.cpp
37

BTW, isn't argv[1] pointing to a temp string above?

ioeric removed a reviewer: ioeric.Jan 17 2017, 12:44 AM
ioeric added a subscriber: ioeric.
alexfh removed a reviewer: alexfh.Sep 21 2017, 8:05 AM
omtcyfz edited the summary of this revision. (Show Details)Sep 21 2017, 4:38 PM
omtcyfz removed a reviewer: klimek.
omtcyfz removed subscribers: ioeric, Alexander_Droste, arphaman and 9 others.
omtcyfz abandoned this revision.Jul 20 2018, 1:48 AM

No longer relevant.

unittests/clang-rename/CMakeLists.txt