This is an archive of the discontinued LLVM Phabricator instance.

[clang-scan-deps] initial outline of the tool that runs preprocessor to find dependencies over a JSON compilation database

Authored by arphaman on Apr 3 2019, 2:48 PM.



This patch introduces an outline for the clang-scan-deps tool that will be used to implement fast dependency discovery phase using implicit modules for explicit module builds.

The initial version of the tool works by computing non-modular header dependencies for files in the compilation database without any optimizations (i.e. without source minimization from The tool spawns a number of worker threads to run the clang compiler workers in parallel.

The immediate goal for clang-scan-deps is to create a ClangScanDeps library which will be used to build up this tool to use the source minimization and caching multi-threaded filesystem to implement the optimized non-incremental dependency scanning phase for a non-modular build. This will allow us to do benchmarks and comparisons for performance that the minimization and caching give us, and maybe setup CI to track that performance. @Bigcheese will then extend the tool to have the support for computing modular dependencies for Clang modules. The tool could possibly interact with build systems in the future if desired, but it's not our immediate goal.

This tool is based on code that was included in the original WIP patch I posted before the dev meeting last October: .

Diff Detail


Event Timeline

arphaman created this revision.Apr 3 2019, 2:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2019, 2:48 PM
ormris added inline comments.Apr 4 2019, 10:45 AM
14 ↗(On Diff #193600)

Quick drive-by nit: This test won't work under Windows due to the path separator. To account for both platforms, you could use a pattern like "{{/|\\}}".

aganea added a subscriber: aganea.Apr 9 2019, 6:56 AM

A comment by whisperity in the origin WIP did not seem to be addressed.
Have you looked at IWYU ?
The end goals of clang-scan-deps and iwyu seem similar so their implementation problems would also be similar.
The iwyu problems doc is

A comment by whisperity in the origin WIP did not seem to be addressed.
Have you looked at IWYU ?
The end goals of clang-scan-deps and iwyu seem similar so their implementation problems would also be similar.
The iwyu problems doc is

IWYU doesn't seem related to me. My understanding is:

  • IWYU checks that semantic dependencies between source files is reflected by the file dependencies (e.g., semantic dependencies should have matching #includes).
  • clang-scan-deps discovers the transitive file dependencies of a TU.

IIUC, IWYU's implementation challenges are related to mapping/discovering semantic dependencies between source files, which is something that clang-scan-deps doesn't care about at all. Can you clarify what link you see?

jkorous added inline comments.Apr 16 2019, 3:50 PM
39 ↗(On Diff #193600)

Files -> FileMgr might be more readable.

56 ↗(On Diff #193600)

Success -> Result?

91 ↗(On Diff #193600)

Maybe it's obvious but maybe we could add some mention of expected lifetime of Compilations argument.

112 ↗(On Diff #193600)

Missing "as"?

124 ↗(On Diff #193600)

We might mention the default value (llvm::hardware_concurrency()).

203 ↗(On Diff #193600)

Shouldn't this be |=? This seems like we're returning only the result of the last run.

arphaman updated this revision to Diff 203012.Jun 4 2019, 1:33 PM
arphaman marked 7 inline comments as done.

Address review comments.

aganea added inline comments.Jun 7 2019, 7:28 AM
4 ↗(On Diff #203012)

Is -MD -MF required for clang-scan-deps to work? Can't we append those arguments automatically, so that only include paths are needed in the CDB?

Would it possible for the tool to produce a collated file? A real-world usage would otherwise produce tens of thousands of files. Which then would be opened/parsed by the calling application.

8 ↗(On Diff #203012)

Add a second test to excercice multi-threading? (by providing a few more files in the CDB)

140 ↗(On Diff #203012)

InitLLVM X(argc, argv); is missing here to print callstacks in case of a crash, like the other tools.

arphaman updated this revision to Diff 203887.Jun 10 2019, 1:23 PM
arphaman marked 3 inline comments as done.
  • Add test for -j2 to exercise multi threading.
  • Add InitLLVM.
4 ↗(On Diff #203012)

I agree, the tool shouldn't rely on those options being there. -MD -MF in this test right now a crutch to get the dependencies printed somewhere.

I think the tool by default should print out the collated dependencies to STDOUT instead of writing them to files that were specified for the build. The -MD -MF options will not be required as well. This implementation requires some changes in the dependency collector, which I am planning to do as part of clang-scan-deps in the follow-up patches. Will it be ok to keep this patch as it is right now, and transition to printing out the collated dependencies without requiring the options as a follow-up?

aganea accepted this revision.Jun 10 2019, 2:46 PM
aganea added inline comments.
4 ↗(On Diff #203012)

Sounds good! Please cc me on the upcoming reviews, we have a good use-case for clang-scan-deps in Fastbuild!

3 ↗(On Diff #203887)

Two space indent.

This revision is now accepted and ready to land.Jun 10 2019, 2:46 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2019, 2:30 PM
arphaman marked 2 inline comments as done.Jun 12 2019, 2:34 PM
arphaman added inline comments.
4 ↗(On Diff #203012)

Sure, I will CC you on the upcoming patches.

3 ↗(On Diff #203887)

Oops, I missed this comment. Fixed in r363205.