[Tooling] A new framework for executing clang frontend actions.
Needs ReviewPublic

Authored by ioeric on Jun 16 2017, 2:36 AM.

Details

Summary

This defines a clang::tooling::ToolExecutor interface that can be extended to support different execution plans including standalone execution on a given set of TUs or parallel execution on all TUs in a codebase.

In order to enable multiprocessing execution, tool actions are expected to output result into a ToolResults interface provided by executors. The ToolResults interface abstracts how results are stored e.g. in-memory for standalone executions or on-disk for large-scale execution.

New executors can be registered as ToolExecutorPlugins via the ToolExecutorPluginRegistry. CLI tools can use createExecutorFromCommandLineArgs to create a specific registered executor according to the command-line arguments.

This patch also implements StandaloneToolExecutor which has the same behavior as the current ClangTool interface, i.e. execute frontend actions on a given set of TUs. At this point, it's simply a wrapper around ClangTool at this point.

This is still experimental but expected to replace the existing ClangTool interface so that specific tools would not need to worry about execution.

ioeric created this revision.Jun 16 2017, 2:36 AM
ioeric edited the summary of this revision. (Show Details)Jun 16 2017, 2:37 AM
ioeric removed a subscriber: klimek.Jun 16 2017, 2:40 AM
ioeric updated this revision to Diff 118388.Tue, Oct 10, 7:26 AM
  • Merged with origin/master
ioeric updated this revision to Diff 118812.Thu, Oct 12, 11:09 AM
  • Added ToolResults interface and moved classes to right files.
ioeric retitled this revision from Prototyping tooling::ToolExecutor. to [Tooling] A new framework for executing clang frontend actions..Fri, Oct 13, 1:56 AM
ioeric edited the summary of this revision. (Show Details)
ioeric edited the summary of this revision. (Show Details)Fri, Oct 13, 2:04 AM
ioeric added a subscriber: djasper.
ioeric added a subscriber: cfe-commits.
ioeric updated this revision to Diff 118898.Fri, Oct 13, 4:34 AM
  • Minor cleanups.

This is a good start, a few comments. Also I'd suggest running clang-format on this patch -- I saw a few places violate the code style.

include/clang/Tooling/CommonOptionsParser.h
89–90

... is set to true for clarify.

112

return llvm::StringRef?

include/clang/Tooling/Execution.h
22

Consider moving this comment to ToolExecutor class to make it more discoverable?

48

How about using std::pair<std::string, std::string> type here? It would make it consistent with the allKVResults below.

Also I think we can define an alias KVResult for this type.

49

nit: all

82

nit: execute

lib/Tooling/Execution.cpp
62

Consider we have two plugins, the first iteration fails, and the second one succeeds, the code looks like still treating it as an error here? And if the case where there are more than one executor being created successfully, we just return the first one.

My understanding of the createExecutorFromCommandLineArgs's behavior is to find a proper ToolExecutorPlugin and create a ToolExecutor instance.

Maybe add a command-line flag like --executor=<my_executor_name> to make it straightforward to choose which registered executor is used, and make StandaloneToolExecutor as default?

74

nit: Executor

unittests/Tooling/ToolingTest.cpp
32 ↗(On Diff #118898)

Seems that it is unused.

638 ↗(On Diff #118898)

any reason to make it static?

667 ↗(On Diff #118898)

nit: size_t, the same below.

ioeric updated this revision to Diff 119117.Mon, Oct 16, 1:51 AM
  • clang-format code.
ioeric updated this revision to Diff 119122.Mon, Oct 16, 2:39 AM
ioeric marked 5 inline comments as done.
  • Address review comments.

Thanks for the review!

include/clang/Tooling/CommonOptionsParser.h
112

I'm not a big fan of StringRef as a return value. Callers can simply use StringRef s = parser.getErrorMessage(), if they want to; otherwise, they would need to worry about the live time of s in auto s = parser.getErrorMessage(), if the interface returns a StringRef.

include/clang/Tooling/Execution.h
48

It would make the the interface a bit cumbersome to use - callers would need to make pairs all the time. And converting to pairs is an unnecessary performance overhead IMO. The consistency with AllKVResults is not that important after all.

We don't have a lot usages of std::pair<std::string, std::string> yet. I'll add an alias when needed. Currently, I think it makes sense to keep the result type explicit.

49

I think for-each is a more canonical name? E.g. we have for_each loop in C++.

lib/Tooling/Execution.cpp
62

Yes, we simply return the first plugin that matches.

It's hard to ensure that all executors have mutually exclusive arguments when more executors are added; it should be up to the executors to define unique (or as-unique-as-possible) arguments. For example, a map-reduce-based executor could require a "--mr" option to be set. The factory should simply return the first one that matches.

The point of this helper function is to hide the dispatching of different executors, so I don't think requiring users to specify the executor name is sensible. For example, users should only care about whether to set --mr instead of having to know the existence of different executors.

unittests/Tooling/ToolingTest.cpp
638 ↗(On Diff #118898)

Added a comment explaining this.

667 ↗(On Diff #118898)

createExecutorFromCommandLineArgs and, more importantly, the underlying library functions take int, so... :(

ioeric updated this revision to Diff 119150.Mon, Oct 16, 6:44 AM
  • Add a ExecutionContext class.
arphaman added inline comments.Mon, Oct 16, 9:47 AM
include/clang/Tooling/CommonOptionsParser.h
114

Would it be better to have an `llvm::Error' instead of the two fields?

include/clang/Tooling/Execution.h
48

You don't need to use the llvm:: prefix for StringRef.

77

Why not unique_ptr/shared_ptr? Who owns the results?

135

Standalone is one word.

137

Maybe this class and InMemoryToolResults should be in a separate header?

ioeric updated this revision to Diff 119314.Tue, Oct 17, 6:51 AM
  • Fix broken unit tests when they are run in threads in the same process.
ioeric updated this revision to Diff 119325.Tue, Oct 17, 8:21 AM
ioeric marked 2 inline comments as done.
  • Move StandaloneToolExecutor into a separate header; added ExecutionTest.cpp.
ioeric updated this revision to Diff 119331.Tue, Oct 17, 8:38 AM
ioeric marked an inline comment as done.
  • Addressed review comments.

Thanks for the review!

include/clang/Tooling/CommonOptionsParser.h
114

It would make the code a bit more complicated... we would need to convert between Error and strings a few times here and in the user code.

include/clang/Tooling/Execution.h
77

The context creator owns the results. Here we only store a reference.

137

Moved StandaloneToolExecutor into a separate header. I am keeping InMemoryToolResults here since it might be used by other executors.

ioeric updated this revision to Diff 119337.Tue, Oct 17, 8:52 AM
  • Minor cleanup in tests.
klimek added inline comments.Tue, Oct 17, 9:10 AM
include/clang/Tooling/CommonOptionsParser.h
89–90

If we want error handling, why not make it a static factory that returns an ErrorOr instead?

include/clang/Tooling/Execution.h
27–28

I'd remove this line. Those tend to only get stale without anybody changing them :)

50

I think = default is the new cool way.

52–54

Why do we need to get the results via the interface? For example, in a map/reduce style setup getting the results is infeasible.

77

I think it's weird that this is virtual, but we also have a default implementation that returns something that we require in the constructor. Either make that virtual, and don't put in a default implementation, or make it non-virtual.

91

I don't think we need to annotate every unowned pointer. Pointers are unowned by default, otherwise we'd use unique_ptr.

165–167

I think for clang-refactor we'll need one level of abstraction in the middle:
We'll need to be able to say "run on all translation units referencing symbol X" - how will this fit into the design?
Also, how will code changes generally fit with this design?

ioeric updated this revision to Diff 119447.Wed, Oct 18, 2:32 AM
  • Add a factory method for creating CommonOptionsParser.
  • Minor cleanup in CommonOptionsParser.
  • Revert CommonOptionsParser changes.
  • Merge branch 'option' into arcpatch-D34272
ioeric updated this revision to Diff 119449.Wed, Oct 18, 3:17 AM
ioeric marked 7 inline comments as done.
  • Address review comments.
include/clang/Tooling/CommonOptionsParser.h
89–90

Done. I pulled the changes for CommonOptionsParser into a separate patch (D39042 ). PTAL

include/clang/Tooling/Execution.h
27–28

Right :)

52–54

This would enable tools to access results regardless of the underlying representation of the results.

In a map/reduce style execution, we could have an implementation that deals with fetching and reading remote files; information about remote files can be fed into the implementation from the executor that performs such execution. It might make sense to add an interface that returns the metadata about the underlying data though (e.g. URI for remote files). WDYT?

165–167

I think for clang-refactor we'll need one level of abstraction in the middle:
We'll need to be able to say "run on all translation units referencing symbol X" - how will this fit into the design?

Yes. Firstly, I think clang-refactor is responsible for getting all TUs that reference a symbols and feeding the set of TUs into the tool interface. And then we could implements a "smart" executor that picks an appropriate execution (e.g. delegate to standalone executor or map/reduce style execution) according to the number of TUs.

Also, how will code changes generally fit with this design?

Code changes will be reported by tools/actions via the ToolResults interface as key-value pairs (e.g. source location + serialized AtomicChange). After the execution, callers would have access to the results in ToolResults, or we could also provide an interface that returns the information about the results so that they can be accessed later.

arphaman added inline comments.Wed, Oct 18, 12:54 PM
include/clang/Tooling/StandaloneExecution.h
2

StandaloneExecution.h?

ioeric updated this revision to Diff 119580.Thu, Oct 19, 7:02 AM
  • Remove unused variable.
  • Narrow supported actions to FrontendActionFactory from ToolAction.