This is an archive of the discontinued LLVM Phabricator instance.

[clang][deps] Split translation units into individual -cc1 or other commands
ClosedPublic

Authored by benlangmuir on Aug 22 2022, 11:57 AM.

Details

Summary

Instead of trying to "fix" the original driver invocation by appending arguments to it, split it into multiple commands, and for each -cc1 command use a CompilerInvocation to give precise control over the invocation.

This change should make it easier to (in the future) canonicalize the command-line (e.g. to improve hits in something like ccache), apply optimizations, or start supporting multi-arch builds, which would require different modules for each arch.

In the long run it may make sense to treat the TU commands as a dependency graph, each with their own dependencies on modules or earlier TU commands, but for now they are simply a list that is executed in order, and the dependencies are simply duplicated. Since we currently only support single-arch builds, there is no parallelism available in the execution.

Diff Detail

Event Timeline

benlangmuir created this revision.Aug 22 2022, 11:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2022, 11:57 AM
benlangmuir requested review of this revision.Aug 22 2022, 11:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2022, 11:57 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
benlangmuir added inline comments.Aug 22 2022, 12:02 PM
clang/test/ClangScanDeps/diagnostics.c
31

I didn't want to radically change all the tests, so in most cases I just loosened the matching so it would match inside the "commands". The modules-full.c and multiple-commands.c cover the whole structure in detail.

tschuett added inline comments.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
43

Why is this not an enum class?

79

Why are all members public?

jansvoboda11 added inline comments.Aug 22 2022, 12:59 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
33

Have you considered using the Job/Command classes the driver uses? What are the downsides of doing that?

clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
45

Is clang:: necessary here?

112

Maybe we could add this API directly to CompilerInvocation. I can see lots of clients not caring about the (potentially more efficient) CompilerInvocation::generateCC1CommandLine() with StringAllocator and inventing their own implementation of this.

clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
154

Shouldn't this be a responsibility of the dependency scanner?

196

Just a heads-up: after I land D132066 today, you might need to update this to respect the eager/lazy loading mode.

clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
476

Wondering if we could move this to CompilerInvocation, right besides resetNonModularOptions().

clang/test/ClangScanDeps/diagnostics.c
31

That sounds good to me.

benlangmuir added inline comments.Aug 22 2022, 1:39 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
33

The driver::Command/driver::JobList is not standalone from the driver - it depends on an external const Action &, const Tool &, and a few const char *, so if we used it we would also need to preserve the Compilation and Driver they came from somewhere, or change the API so the commands were only available inside a callback with limited scope that they are immediately copied out of. I'm also not sure what the consequences of mutating Commands are (there is a replaceArguments, but I don't know if it could break invariants).

clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
154

Good question! I was mirroring how it works for modules where we do this in the ModuleDepCollector and store the whole invocation in the ModuleDeps. But I guess that's "inside" the dep scanner, and this is "outside". Happy to shuffle things around.

What is your opinion of takeFullDependencies() adding to FrontendOpts.ModuleFiles, ? That is also arguably the scanner's responsibility.

Another thing we could do here is remove BuildInvocation from FullDependencies and ModuleDeps and only expose the serialized -cc1 arguments. It would simplify the new code a bit since there'd only be one kind of "command". On the other hand, I could also see it being potentially useful to have the whole invocation available if you were writing a C++ tool that used the scanner API rather than just clang-scan-deps itself.

benlangmuir added inline comments.Aug 22 2022, 3:53 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
112
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
476
jansvoboda11 added inline comments.Aug 23 2022, 12:23 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
33

Makes sense, let's keep this as-is then.

clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
154

Good question! I was mirroring how it works for modules where we do this in the ModuleDepCollector and store the whole invocation in the ModuleDeps. But I guess that's "inside" the dep scanner, and this is "outside". Happy to shuffle things around.

As discussed offline, I'd prefer keeping CompilerInvocation internal to the dependency scanner and exposing simple types (std::vector<std::string> for the command line) unless clients need more flexibility.

What is your opinion of takeFullDependencies() adding to FrontendOpts.ModuleFiles, ? That is also arguably the scanner's responsibility.

I'd prefer adding to FrontendOpts.ModuleFiles earlier, before the consumer.

Another thing we could do here is remove BuildInvocation from FullDependencies and ModuleDeps and only expose the serialized -cc1 arguments. It would simplify the new code a bit since there'd only be one kind of "command". On the other hand, I could also see it being potentially useful to have the whole invocation available if you were writing a C++ tool that used the scanner API rather than just clang-scan-deps itself.

+1

  • Remove CompilerInvocation from Command and ModuleDeps. Only the arg strings are exposed now.
  • Simplify Command, which is now just a simple struct.
  • Move the logic for mutating the CompilerInvocation for the TU into the scanner.
  • Minor refactoring to reduce nesting in DependencyScanningWorker.
  • Rebase
benlangmuir marked 11 inline comments as done.Aug 24 2022, 12:10 PM
benlangmuir added inline comments.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
43

Removed in latest change.

79

Latest change simplifies this to a struct.

clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
154

Done in latest changes. The logic for mutating the compiler invocation is now contained in the ModuleDepCollector, alongside the existing logic for doing the same kind of changes to module build commands.

I left a couple of smaller initial comments. I'd like to see this split into multiple patches. I can see some formatting changes, removal of CompilerInvocation from ModuleDeps, isolated changes to Tooling, etc. That'd make it much easier to review.

clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
196

Nit: handle*() functions are associated with DependencyConsumer in my brain. Could we find a distinct name to avoid confusion between all the different types we're using here?

212

I assume you're not using llvm::DenseMap<ModuleID, ModuleDeps *> because it's a hassle to implement for custom keys and performance is not a concern, correct? Any other aspects?

214

Would this allow us to remove ModuleDepCollectorPP::DirectPrebuiltModularDeps?

clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
343

As said in a previous comment, we might avoid the copy by storing this in MDC in the first place.

452

Nit: this function might get easier to think about if we did this in one step with the context hash calculation:

MDC.associateWithContextHash(MD, CI);
MDC.addOutputPaths(MD, CI);
MD.BuildArguments = CI.getCC1CommandLine();

I'd like to see this split into multiple patches. I can see some formatting changes, removal of CompilerInvocation from ModuleDeps, isolated changes to Tooling, etc. That'd make it much easier to review.

ToolInvocation change: https://reviews.llvm.org/D132615
Remove CompilerInvocation from ModuleDeps: https://reviews.llvm.org/D132616
Factoring out the addModule*Files functions, sink DirectPrebuiltModularDeps, etc: https://reviews.llvm.org/D132617

benlangmuir added inline comments.Aug 24 2022, 3:52 PM
clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
212

No good reason. I switched it to DenseMap and included the change in https://reviews.llvm.org/D132617

214

I sunk it to MDC in https://reviews.llvm.org/D132617. It needed to change to MapVector since we are also uniquing them.

clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
343

We don't save any copies, since we need to actually store the PrebuiltModuleDep in MDC in order to apply it to a secondary CompilerInvocation after the preprocessor etc. is gone. But I still like sinking it down.

452

Good idea, included in https://reviews.llvm.org/D132617

  • Rebased, picking up the changes that were previously split out.
  • Broke up applying changes to CI from passing to consumer. The MDC is no longer responsible for passing invocations to the consumer, and it is done at the Worker level.
benlangmuir added inline comments.Aug 25 2022, 11:33 AM
clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
196

I split up applying changes to the CI from passing to the consumer. Hopefully it's a little simpler now.

clang/test/ClangScanDeps/modules-implicit-dot-private.m
70

Calling out this change so it doesn't disappear into the other test changes. The order changed because in the driver version of this we were appending -fmodule-file= options in the order of dependencies, but in the new code we insert into PrebuiltModuleFiles which is a std::map, so the entries are alphabetically sorted.

jansvoboda11 added inline comments.Aug 25 2022, 1:00 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
27

Not needed anymore, I assume.

43

Would it make sense to accept the clang::tooling::dependencies::Command struct here?

clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
165

This makes sure we only run scan once per driver invocation? Can you expand on this a bit? Maybe even put the reasoning into a comment in the code.

455

I'm not particularly fond of the fact that Consumer.handleBuildCommand() is called in this lambda directly in the non-clang case, but several objects deep in the normal case (ToolInvocation -> DependencyScanningAction). I think a clearer way to do this would be to somehow extract the CompilerInvocation (or Command) result from ToolInvocation and report it in this lambda too.

clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
177

I think this could be useful for other tools too in the future. Do you think it would make sense to put this in a more prominent header, so that other people can find it and reuse it more easily?

clang/test/ClangScanDeps/deprecated-driver-api.c
2

Please summarize what this test does.

clang/test/ClangScanDeps/multiple-commands.c
2

Please summarize what this test does.

  • handleBuildCommand now takes a Command
  • Removed now-unused forward decl
  • Expanded comment about scan-once behaviour
  • Moved all calls to handleBuildCommand to a single place
  • Added comments to new tests
benlangmuir marked 6 inline comments as done.Aug 25 2022, 2:20 PM
benlangmuir added inline comments.
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
165

In theory you want to scan once for each independent chain of -cc1 commands, but since we don't yet support multi-arch builds in the scanner that just means scan once per driver invocation.

I'll add a comment.

455

Yeah, this area has gone through a lot of churn as I try to balance the desire to keep it clear where the consumer should be called vs. trying to keep the MDC contained to the action.

I took another crack at it in the latest diff.

clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
177

I would prefer not to expose this without more understanding of what other use cases there are. It seems like there are many ways to interpret "needsModules" -- most of the time you probably want something more like LangOptions::Modules.

jansvoboda11 accepted this revision.Aug 30 2022, 1:29 PM

LGTM. Up to you if you act on the last comment. Thanks for seeing this through!

clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
295

I'm not a fan of storing MDC as a member, but I don't see any clearly better alternatives. One option would be to pass CompilerInvocation out-parameter to ModuleDepCollector's constructor and relying on the collector to apply discovered dependencies, but that's not super obvious.

This revision is now accepted and ready to land.Aug 30 2022, 1:29 PM
benlangmuir marked 2 inline comments as done.Aug 30 2022, 2:44 PM
benlangmuir added inline comments.
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
295

We are applying changes to multiple CompilerInvocations (the scanned one, but also any later ones that are downstream of it), so it's less obvious to me how to do that here. We would need to pass in all of the invocations, but currently only one of them is in-memory at a time. I'm open to other ideas here, but I haven't come up with any great ideas. Maybe there's a refactoring of MDC that splits the scanning state from the state necessary to apply changes to the CI so we would only expose the minimum information needed, but I expect that would be an invasive change so prefer not to attempt it in this patch.

This revision was landed with ongoing or failed builds.Aug 30 2022, 3:44 PM
This revision was automatically updated to reflect the committed changes.