[Tooling] A new framework for executing clang frontend actions.
ClosedPublic

Authored by ioeric on Jun 16 2017, 2:36 AM.

Details

Summary

This defines a clang::tooling::ToolExecutor interface that can be extended to support different execution plans including standalone execution on a given set of TUs or parallel execution on all TUs in a codebase.

In order to enable multiprocessing execution, tool actions are expected to output result into a ToolResults interface provided by executors. The ToolResults interface abstracts how results are stored e.g. in-memory for standalone executions or on-disk for large-scale execution.

New executors can be registered as ToolExecutorPlugins via the ToolExecutorPluginRegistry. CLI tools can use createExecutorFromCommandLineArgs to create a specific registered executor according to the command-line arguments.

This patch also implements StandaloneToolExecutor which has the same behavior as the current ClangTool interface, i.e. execute frontend actions on a given set of TUs. At this point, it's simply a wrapper around ClangTool at this point.

This is still experimental but expected to replace the existing ClangTool interface so that specific tools would not need to worry about execution.

Diff Detail

Repository
rL LLVM
There are a very large number of changes, so older changes are hidden. Show Older Changes

This is a good start, a few comments. Also I'd suggest running clang-format on this patch -- I saw a few places violate the code style.

include/clang/Tooling/CommonOptionsParser.h
90 ↗(On Diff #118812)

... is set to true for clarify.

109 ↗(On Diff #118812)

return llvm::StringRef?

include/clang/Tooling/Execution.h
21 ↗(On Diff #118898)

Consider moving this comment to ToolExecutor class to make it more discoverable?

47 ↗(On Diff #118898)

How about using std::pair<std::string, std::string> type here? It would make it consistent with the allKVResults below.

Also I think we can define an alias KVResult for this type.

48 ↗(On Diff #118898)

nit: all

81 ↗(On Diff #118898)

nit: execute

lib/Tooling/Execution.cpp
61 ↗(On Diff #118898)

Consider we have two plugins, the first iteration fails, and the second one succeeds, the code looks like still treating it as an error here? And if the case where there are more than one executor being created successfully, we just return the first one.

My understanding of the createExecutorFromCommandLineArgs's behavior is to find a proper ToolExecutorPlugin and create a ToolExecutor instance.

Maybe add a command-line flag like --executor=<my_executor_name> to make it straightforward to choose which registered executor is used, and make StandaloneToolExecutor as default?

73 ↗(On Diff #118898)

nit: Executor

unittests/Tooling/ToolingTest.cpp
32 ↗(On Diff #118898)

Seems that it is unused.

638 ↗(On Diff #118898)

any reason to make it static?

667 ↗(On Diff #118898)

nit: size_t, the same below.

ioeric updated this revision to Diff 119117.Oct 16 2017, 1:51 AM
  • clang-format code.
ioeric updated this revision to Diff 119122.Oct 16 2017, 2:39 AM
ioeric marked 5 inline comments as done.
  • Address review comments.

Thanks for the review!

include/clang/Tooling/CommonOptionsParser.h
109 ↗(On Diff #118812)

I'm not a big fan of StringRef as a return value. Callers can simply use StringRef s = parser.getErrorMessage(), if they want to; otherwise, they would need to worry about the live time of s in auto s = parser.getErrorMessage(), if the interface returns a StringRef.

include/clang/Tooling/Execution.h
47 ↗(On Diff #118898)

It would make the the interface a bit cumbersome to use - callers would need to make pairs all the time. And converting to pairs is an unnecessary performance overhead IMO. The consistency with AllKVResults is not that important after all.

We don't have a lot usages of std::pair<std::string, std::string> yet. I'll add an alias when needed. Currently, I think it makes sense to keep the result type explicit.

48 ↗(On Diff #118898)

I think for-each is a more canonical name? E.g. we have for_each loop in C++.

lib/Tooling/Execution.cpp
61 ↗(On Diff #118898)

Yes, we simply return the first plugin that matches.

It's hard to ensure that all executors have mutually exclusive arguments when more executors are added; it should be up to the executors to define unique (or as-unique-as-possible) arguments. For example, a map-reduce-based executor could require a "--mr" option to be set. The factory should simply return the first one that matches.

The point of this helper function is to hide the dispatching of different executors, so I don't think requiring users to specify the executor name is sensible. For example, users should only care about whether to set --mr instead of having to know the existence of different executors.

unittests/Tooling/ToolingTest.cpp
638 ↗(On Diff #118898)

Added a comment explaining this.

667 ↗(On Diff #118898)

createExecutorFromCommandLineArgs and, more importantly, the underlying library functions take int, so... :(

ioeric updated this revision to Diff 119150.Oct 16 2017, 6:44 AM
  • Add a ExecutionContext class.
arphaman added inline comments.Oct 16 2017, 9:47 AM
include/clang/Tooling/CommonOptionsParser.h
116 ↗(On Diff #119150)

Would it be better to have an `llvm::Error' instead of the two fields?

include/clang/Tooling/Execution.h
47 ↗(On Diff #119150)

You don't need to use the llvm:: prefix for StringRef.

76 ↗(On Diff #119150)

Why not unique_ptr/shared_ptr? Who owns the results?

134 ↗(On Diff #119150)

Standalone is one word.

136 ↗(On Diff #119150)

Maybe this class and InMemoryToolResults should be in a separate header?

ioeric updated this revision to Diff 119314.Oct 17 2017, 6:51 AM
  • Fix broken unit tests when they are run in threads in the same process.
ioeric updated this revision to Diff 119325.Oct 17 2017, 8:21 AM
ioeric marked 2 inline comments as done.
  • Move StandaloneToolExecutor into a separate header; added ExecutionTest.cpp.
ioeric updated this revision to Diff 119331.Oct 17 2017, 8:38 AM
ioeric marked an inline comment as done.
  • Addressed review comments.

Thanks for the review!

include/clang/Tooling/CommonOptionsParser.h
116 ↗(On Diff #119150)

It would make the code a bit more complicated... we would need to convert between Error and strings a few times here and in the user code.

include/clang/Tooling/Execution.h
76 ↗(On Diff #119150)

The context creator owns the results. Here we only store a reference.

136 ↗(On Diff #119150)

Moved StandaloneToolExecutor into a separate header. I am keeping InMemoryToolResults here since it might be used by other executors.

ioeric updated this revision to Diff 119337.Oct 17 2017, 8:52 AM
  • Minor cleanup in tests.
klimek added inline comments.Oct 17 2017, 9:10 AM
include/clang/Tooling/CommonOptionsParser.h
90 ↗(On Diff #118812)

If we want error handling, why not make it a static factory that returns an ErrorOr instead?

include/clang/Tooling/Execution.h
26–27 ↗(On Diff #119337)

I'd remove this line. Those tend to only get stale without anybody changing them :)

49 ↗(On Diff #119337)

I think = default is the new cool way.

51–53 ↗(On Diff #119337)

Why do we need to get the results via the interface? For example, in a map/reduce style setup getting the results is infeasible.

76 ↗(On Diff #119337)

I think it's weird that this is virtual, but we also have a default implementation that returns something that we require in the constructor. Either make that virtual, and don't put in a default implementation, or make it non-virtual.

90 ↗(On Diff #119337)

I don't think we need to annotate every unowned pointer. Pointers are unowned by default, otherwise we'd use unique_ptr.

164–166 ↗(On Diff #119337)

I think for clang-refactor we'll need one level of abstraction in the middle:
We'll need to be able to say "run on all translation units referencing symbol X" - how will this fit into the design?
Also, how will code changes generally fit with this design?

ioeric updated this revision to Diff 119447.Oct 18 2017, 2:32 AM
  • Add a factory method for creating CommonOptionsParser.
  • Minor cleanup in CommonOptionsParser.
  • Revert CommonOptionsParser changes.
  • Merge branch 'option' into arcpatch-D34272
ioeric updated this revision to Diff 119449.Oct 18 2017, 3:17 AM
ioeric marked 7 inline comments as done.
  • Address review comments.
include/clang/Tooling/CommonOptionsParser.h
90 ↗(On Diff #118812)

Done. I pulled the changes for CommonOptionsParser into a separate patch (D39042 ). PTAL

include/clang/Tooling/Execution.h
26–27 ↗(On Diff #119337)

Right :)

51–53 ↗(On Diff #119337)

This would enable tools to access results regardless of the underlying representation of the results.

In a map/reduce style execution, we could have an implementation that deals with fetching and reading remote files; information about remote files can be fed into the implementation from the executor that performs such execution. It might make sense to add an interface that returns the metadata about the underlying data though (e.g. URI for remote files). WDYT?

164–166 ↗(On Diff #119337)

I think for clang-refactor we'll need one level of abstraction in the middle:
We'll need to be able to say "run on all translation units referencing symbol X" - how will this fit into the design?

Yes. Firstly, I think clang-refactor is responsible for getting all TUs that reference a symbols and feeding the set of TUs into the tool interface. And then we could implements a "smart" executor that picks an appropriate execution (e.g. delegate to standalone executor or map/reduce style execution) according to the number of TUs.

Also, how will code changes generally fit with this design?

Code changes will be reported by tools/actions via the ToolResults interface as key-value pairs (e.g. source location + serialized AtomicChange). After the execution, callers would have access to the results in ToolResults, or we could also provide an interface that returns the information about the results so that they can be accessed later.

arphaman added inline comments.Oct 18 2017, 12:54 PM
include/clang/Tooling/StandaloneExecution.h
1 ↗(On Diff #119449)

StandaloneExecution.h?

ioeric updated this revision to Diff 119580.Oct 19 2017, 7:02 AM
  • Remove unused variable.
  • Narrow supported actions to FrontendActionFactory from ToolAction.
ioeric updated this revision to Diff 120068.Oct 24 2017, 6:49 AM
  • Experimented and refined the interfaces. o Get rid of ExecutionConfig class; pass actions into executors directly. o Add ArgumentAdjuster to ExecutionContext. o Make command-line based executor selection more explicit by requiring a "--executor" option.
ioeric marked an inline comment as done.Oct 24 2017, 6:56 AM

PTAL

I experimented with our internal tools (incl. mapreduce-based execution) and modified the interfaces a bit:
o Get rid of ExecutionConfig class; pass actions into executors directly.
o Moved ArgumentAdjusters that are common to all actions into ExecutionContext.
o Make command-line based executor selection more explicit by requiring an --executor option.

ioeric updated this revision to Diff 120232.Oct 25 2017, 5:06 AM
  • Get rid of the unused 'createExecutorByName' interface.
ioeric updated this revision to Diff 120250.Oct 25 2017, 7:40 AM
  • Only expose result reporting interface from ExecutionContext so that callbacks don't have access to results.
ioeric updated this revision to Diff 120254.Oct 25 2017, 7:53 AM
  • Minor comment update.
ioeric added inline comments.Oct 25 2017, 7:53 AM
include/clang/Tooling/Execution.h
51–53 ↗(On Diff #119337)

After offline discussion, I now understand the concern about having tool callbacks be able to access results.

I've changed the ExecutionContext to only expose a result reporting interface instead of returning a reference of ToolResults - callbacks are expected to only have access to ExecutionContext. The ToolResults would only be available from the executor itself.

klimek added inline comments.Oct 25 2017, 8:09 AM
include/clang/Tooling/Execution.h
76 ↗(On Diff #120254)

I think the argument adjuster adjustment shouldn't be part of this interface, as the argument adjusters cannot be changed in the phase in which we want the ExecutionContext to be used.

I'd just make the argument adjusters a parameter on the constructor here (or alternatively, do not couple them in here, and just hand them around as a separate entity).

130–131 ↗(On Diff #120254)

I'd put the ArgumentAdjust second (as the action is the primary thing being acted on).

ioeric updated this revision to Diff 120370.Oct 26 2017, 1:28 AM
ioeric marked 2 inline comments as done.
  • Addressed review comments o Removed argument adjusting interfaces from ExecutionContext o Switched positions of Action and Adjuster parameters in the execute interfaces
ioeric added inline comments.Oct 26 2017, 1:28 AM
include/clang/Tooling/Execution.h
76 ↗(On Diff #120254)

You are right. After a second thought, the adjuster here is actually only used by the executor itself (e.g. default args), and executors can maintain their own adjusters internally, so it doesn't need to be in the interface.

I've removed the adjuster interfaces from ExecutionContext. Action-specific adjusters can still be passed in through the ToolExecutor::execute(...) interface.

This revision is now accepted and ready to land.Oct 26 2017, 1:55 AM
This revision was automatically updated to reflect the committed changes.

Hello, I'm afraid this commit is failing on Windows.

http://lab.llvm.org:8011/console?builder=llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast&builder=llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast

FAIL: Clang-Unit :: Tooling/./ToolingTests.exe/StandaloneToolTest.SimpleActionWithResult (14523 of 35936)

  • TEST 'Clang-Unit :: Tooling/./ToolingTests.exe/StandaloneToolTest.SimpleActionWithResult' FAILED ****

Note: Google Test filter = StandaloneToolTest.SimpleActionWithResult

[==========] Running 1 test from 1 test case.

[----------] Global test environment set-up.

[----------] 1 test from StandaloneToolTest

[ RUN ] StandaloneToolTest.SimpleActionWithResult

C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\tools\clang\unittests\Tooling\ExecutionTest.cpp(217): error: Value of: !Err

Actual: false

Expected: true

error: error reading 'C:\a.cc'

1 error generated.

Error while processing C:\a.cc.

Program aborted due to an unhandled Error:

Failed to run action.LLVMSymbolizer: error reading file: PDB Error: Unable to load PDB. Make sure the file exists and is readable. Calling loadDataForExe

#0 0x0114e867 (C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.obj\tools\clang\unittests\Tooling\ToolingTests.exe+0x45e867)

#1 0x76805422 (C:\WINDOWS\System32\ucrtbase.dll+0xa5422)

#2 0x00d67dec (C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.obj\tools\clang\unittests\Tooling\ToolingTests.exe+0x77dec)

#3 0x011820a4 (C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.obj\tools\clang\unittests\Tooling\ToolingTests.exe+0x4920a4)


PASS: Extra Tools Unit Tests :: change-namespace/./ChangeNamespaceTests.exe/ChangeNamespaceTest.SimpleMoveEnum (14524 of 35936)
PASS: Extra Tools Unit Tests :: change-namespace/./ChangeNamespaceTests.exe/ChangeNamespaceTest.DontMoveForwardDeclarationInClass (14525 of 35936)
FAIL: Clang-Unit :: Tooling/./ToolingTests.exe/StandaloneToolTest.SimpleAction (14526 of 35936)

  • TEST 'Clang-Unit :: Tooling/./ToolingTests.exe/StandaloneToolTest.SimpleAction' FAILED ****

Note: Google Test filter = StandaloneToolTest.SimpleAction

[==========] Running 1 test from 1 test case.

[----------] Global test environment set-up.

[----------] 1 test from StandaloneToolTest

[ RUN ] StandaloneToolTest.SimpleAction

C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\tools\clang\unittests\Tooling\ExecutionTest.cpp(204): error: Value of: !Err

Actual: false

Expected: true

error: error reading 'C:\a.cc'

1 error generated.

Error while processing C:\a.cc.

Program aborted due to an unhandled Error:

Failed to run action.LLVMSymbolizer: error reading file: PDB Error: Unable to load PDB. Make sure the file exists and is readable. Calling loadDataForExe

#0 0x0114e867 (C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.obj\tools\clang\unittests\Tooling\ToolingTests.exe+0x45e867)

#1 0x76805422 (C:\WINDOWS\System32\ucrtbase.dll+0xa5422)

#2 0x00d67dec (C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.obj\tools\clang\unittests\Tooling\ToolingTests.exe+0x77dec)

#3 0x011820a4 (C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.obj\tools\clang\unittests\Tooling\ToolingTests.exe+0x4920a4)


bjope added a subscriber: bjope.Oct 26 2017, 5:50 AM

I get errors when compiling due to this commit:

FAILED: tools/clang/lib/Tooling/CMakeFiles/clangTooling.dir/StandaloneExecution.cpp.o 
/repo/app/clang/3.6/bin/clang++  -march=corei7  -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/lib/Tooling -I../tools/clang/lib/Tooling -I../tools/clang/include -Itools/clang/include -Iinclude -I../include -I/repo/app/valgrind/3.11.0/include  -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fcolor-diagnostics -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3    -UNDEBUG  -fno-exceptions -fno-rtti -MMD -MT tools/clang/lib/Tooling/CMakeFiles/clangTooling.dir/StandaloneExecution.cpp.o -MF tools/clang/lib/Tooling/CMakeFiles/clangTooling.dir/StandaloneExecution.cpp.o.d -o tools/clang/lib/Tooling/CMakeFiles/clangTooling.dir/StandaloneExecution.cpp.o -c ../tools/clang/lib/Tooling/StandaloneExecution.cpp
../tools/clang/lib/Tooling/StandaloneExecution.cpp:65:11: error: unused variable 'Ret' [-Werror,-Wunused-variable]
  if (int Ret = Tool.run(Action.first.get()))
          ^
1 error generated.
jklaehn added a subscriber: jklaehn.Nov 2 2017, 6:52 AM