This is an archive of the discontinued LLVM Phabricator instance.

[clang][deps] NFC: Split out the module-based API from the TU-based API
ClosedPublic

Authored by jansvoboda11 on Dec 15 2022, 4:20 PM.

Details

Summary

For users of the C++ API, the return type of getFullDependencies doesn't make sense when asking for dependencies of a module. In the returned FullDependenciesResult instance, only DiscoveredModules is useful (the graph of modular dependecies). The FullDeps member is trying to describe a translation unit it was never given. Its command line also refers to a file in the in-memory VFS we create in the scanner, leaking the implementation detail.

This patch splits the API and improves layering and naming of the return types.

Depends on D140175.

Diff Detail

Event Timeline

jansvoboda11 created this revision.Dec 15 2022, 4:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 4:20 PM
Herald added a subscriber: ributzka. · View Herald Transcript
jansvoboda11 requested review of this revision.Dec 15 2022, 4:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 4:20 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
artemcm requested changes to this revision.Dec 16 2022, 11:37 AM
artemcm added inline comments.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
112

The existing client of the ModuleName API queries a single module and it's nice to get a structured output of a FullDependenciesResult for the module being queried itself, and get its dependencies from that. Whereas now we will just be getting a sequence of ModuleDepss. So just to confirm, if a client queries getModuleDependencies("Foo", ...), the resulting ModuleDepsGraph will contain a ModuleDeps for Foo itself also, and not just Foo's dependencies?

Still, the client will now need to traverse this graph to find Foo in particular, and it'd be nice to not have to do that. TranslationUnitDeps is not the right thing here, but perhaps something simpler could work, along the lines of:

struct SingleModuleDeps {
  ModuleDeps mainQueriedModule

  /// The graph of direct and transitive modular dependencies of mainQueriedModule
  ModuleDepsGraph ModuleGraph;
};

What do you think?

This revision now requires changes to proceed.Dec 16 2022, 11:37 AM
jansvoboda11 added inline comments.Dec 16 2022, 1:51 PM
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
112

Yes, ModuleDepsGraph will contain both the dependencies and Foo itself.

Looking at the client code, it doesn't look like it's making any use of the structured nature of FullDependenciesResult, right? The unused FullDependenciesResult::FullDeps field doesn't hold anything useful ATM and FullDependenciesResult::DiscoveredModules is equivalent to the new ModuleDepsGraph. Your client iterates over the whole (flat) dependency graph, creates a cache (map from the module name to its information), and then looks up the queried module in the map.

I'm not sure the suggested return type is buying you much. You probably still need to iterate over the whole graph to create the associative cache. (I guess you might save one insert and find, since we'd give you the root explicitly.) If knowing information on the queried module enables something more interesting, I propose this:

struct SingleModuleDeps {
  ModuleID RootModuleID;
  std::unordered_map<ModuleID, ModuleDeps> ModuleGraph;
};

This gives you the option to look up information on the root module in constant time, but has the advantage of not "splitting" the dependency graph into two parts (the queried module and the rest). I think this might be useful in the future. Let's say your client learns it needs dependencies of a bunch of Clang modules. It could send them all to the Clang scanner at once, which might have some advantages: better parallelism and returning single dependency graph (less allocations and data copies if parts are shared). If we stored ModuleDeps for the queried modules outside of the rest of the ModuleGraph, it becomes hard to model dependencies between the queried modules.

Let me know what you think. I'm happy to do this, but maybe in a follow up patch?

artemcm accepted this revision.Dec 16 2022, 1:56 PM
artemcm added inline comments.
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
112

I like your proposed SingleModuleDeps design, and think that it'd be useful to both be able to identify the root in constant time and the potential followup use-cases that could benefit from it sound nice too. Doing it in a followup patch sounds just fine. Thanks!

This revision is now accepted and ready to land.Dec 16 2022, 1:56 PM