This is an archive of the discontinued LLVM Phabricator instance.

[clang][deps] Make resource directory deduction configurable
ClosedPublic

Authored by jansvoboda11 on Aug 19 2021, 4:53 AM.

Details

Summary

The clang-scan-deps CLI tool invokes the compiler with -print-resource-dir in case the -resource-dir argument is missing from the compilation command line. This is to enable running the tool on compilation databases that use compiler from a different toolchain than clang-scan-deps itself. While this doesn't make sense when scanning modular builds (due to the -cc1 arguments the tool generates), the tool can can be used to efficiently scan for file dependencies of non-modular builds too.

This patch stops deducing the resource directory by invoking the compiler by default. This mode can still be enabled by invoking clang-scan-deps with --resource-dir-recipe invoke-compiler. The new default is --resource-dir-recipe modify-compiler-path which relies on the resource directory deduction taking place in Driver::Driver which is based on the compiler path. This makes the default more aligned with the intended usage of the tool while still allowing it to serve other use-cases.

Note that this functionality was also influenced by D108979, where the dependency scanner stopped going through ClangTool::run. The function tried to deduce the resource directory based on the current executable path, which might not be what the users expect when invoked from within a shared library.

Depends on D108979.

Diff Detail

Event Timeline

jansvoboda11 requested review of this revision.Aug 19 2021, 4:53 AM
jansvoboda11 created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2021, 4:53 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
jansvoboda11 edited the summary of this revision. (Show Details)Aug 19 2021, 4:54 AM

Tagging @kousikk, since this is related to D69122 that introduced ResourceDirectoryCache to clang-scan-deps. When the compilation command doesn't have a -resource-dir argument, ResourceDirectoryCache invokes the specified compiler with -print-resource-dir and injects the result into the command-line as -resource-dir.

This happens way before the dependency scanner worker is invoked, meaning the logic this patch tweaks won't usually kick in. The test passes only because the invocation of /our/custom/bin/clang -print-resource-dir made by ResourceDirectoryCache silently fails (the binary doesn't exist), allowing the worker to deduce the resource directory using regular driver logic.

I think both clang-scan-deps and the downstream libclang API clearly head towards only supporting compilers built the same way (same version, architecture, etc.). The modular dependency scanner already returns command-lines of cc1 arguments that are not stable across Clang versions.

I wanted to see if we can reach consensus on removing ResourceDirectoryCache entirely. It makes the resource directory deduction much more lightweight and is in line with the direction we're already going regarding compiler compatibility. It also allows clang-scan-deps and libclang API to have the same behavior, which is a desirable property IMO. If users really want the behavior of ResourceDirectoryCache, they can keep using prior versions clang-scan-deps.

What do you all think?

Account for Windows backslashes

The patch seems mostly straightforward, but can you clarify whether there was a functionality change for clang-scan-deps? My reading of the code suggests not, because it was using ResourceDirectoryCache before and still will... in which case, can you walk me through how the test covers this code change? (Maybe some comments in the test would help.)

Tagging @kousikk, since this is related to D69122 that introduced ResourceDirectoryCache to clang-scan-deps. When the compilation command doesn't have a -resource-dir argument, ResourceDirectoryCache invokes the specified compiler with -print-resource-dir and injects the result into the command-line as -resource-dir.

This happens way before the dependency scanner worker is invoked, meaning the logic this patch tweaks won't usually kick in. The test passes only because the invocation of /our/custom/bin/clang -print-resource-dir made by ResourceDirectoryCache silently fails (the binary doesn't exist), allowing the worker to deduce the resource directory using regular driver logic.

I think both clang-scan-deps and the downstream libclang API clearly head towards only supporting compilers built the same way (same version, architecture, etc.). The modular dependency scanner already returns command-lines of cc1 arguments that are not stable across Clang versions.

I wanted to see if we can reach consensus on removing ResourceDirectoryCache entirely. It makes the resource directory deduction much more lightweight and is in line with the direction we're already going regarding compiler compatibility. It also allows clang-scan-deps and libclang API to have the same behavior, which is a desirable property IMO. If users really want the behavior of ResourceDirectoryCache, they can keep using prior versions clang-scan-deps.

What do you all think?

Jan and I already talked offline, but FTR, I agree with the direction of expecting the "dependency scanning" APIs to run from a libclang/clang-scan-deps that's installed alongside clang itself. This is required for the dependency scanning to be correct anyway (since different compilers could have different pre-defined macros).

The patch seems mostly straightforward, but can you clarify whether there was a functionality change for clang-scan-deps? My reading of the code suggests not, because it was using ResourceDirectoryCache before and still will... in which case, can you walk me through how the test covers this code change? (Maybe some comments in the test would help.)

For clang-scan-deps, this is almost NFC. This code kicks in iff ResourceDirectoryCache doesn't provide any result:

  • the execution of CommandLine[0] -print-resource-dir command returns a non-zero exit code (exercised in the test) or doesn't print anything,
  • the CommandLine is empty,
  • the compiler executable (CommandLine[0]) is not an absolute path.

That's why I want to check with people if we can agree on removing ResourceDirectoryCache. I'd be keen on updating this patch to include the change and make the test more obviously correct.

dexonsmith requested changes to this revision.Aug 23 2021, 9:49 AM

For clang-scan-deps, this is almost NFC. This code kicks in iff ResourceDirectoryCache doesn't provide any result:

  • the execution of CommandLine[0] -print-resource-dir command returns a non-zero exit code (exercised in the test) or doesn't print anything,
  • the CommandLine is empty,
  • the compiler executable (CommandLine[0]) is not an absolute path.

That's why I want to check with people if we can agree on removing ResourceDirectoryCache. I'd be keen on updating this patch to include the change and make the test more obviously correct.

Got it.

I wonder if it'd be better to split "changing default behaviour of clang-scan-deps the tool" from "avoid inject-resource-dir logic when it's incorrect". I.e., commit something in this patch to fix the immediate problem and remove/optimize ResourceDirectoryCache in a follow-up.

For this patch, you could add an inject-resource-dir flag to the scanning service, which controls the new Clang::Tooling flag (and can be controlled by libclang), and a command-line flag in clang-scan-deps to allow testing both sides of it. Would be worth documenting the testcase with a good comment to explain the corner case. Maybe the command-line flag would also disable ResourceDirectoryCache.

clang/test/ClangScanDeps/resource_directory.c
3

I don't love this hardcoded path to clang which could exist on some machine.

Could you use /dev/null for the path to clang?
Or /dev/null/bin/clang?

This revision now requires changes to proceed.Aug 23 2021, 9:49 AM

Rebase on top of D108979

jansvoboda11 retitled this revision from [clang][deps] Deduce resource directory from the compiler path to [clang][deps] Make resource directory deduction configurable.Sep 6 2021, 1:33 AM
jansvoboda11 edited the summary of this revision. (Show Details)

@dexonsmith I have incorporated your suggestions.

Attempt to fix tests in CI

Attempt to fix CI again

Fix CI pls?

jansvoboda11 added inline comments.Sep 9 2021, 8:46 AM
clang/test/ClangScanDeps/resource_directory.c
3

This is not a concern anymore. It used to be, since the deduction of resource directory based on the compiler path was only kicking in if the -print-resource-dir invocation failed (e.g. the compiler executable didn't exist). Since we're now explicitly passing --resource-dir-recipe modify-compiler-path, it doesn't matter if the compiler executable exists on the filesystem or not. We're always going to do simple string manipulation in this case.

For the other test-case (--resource-dir-recipe invoke-compiler), we're actually using an executable that does exist on the filesystem.

dexonsmith accepted this revision.Sep 9 2021, 8:49 AM

LGTM.

clang/test/ClangScanDeps/resource_directory.c
3

Okay. As long as existence on disk of /custom/compiler/resources and/or /our/custom/clang won't change behaviour of the test, I think the test is fine.

This revision is now accepted and ready to land.Sep 9 2021, 8:49 AM

Add REQUIRES: shell to the test, rebase.

mgorny added a subscriber: mgorny.Dec 12 2021, 2:01 PM

This change causes a test regression on Gentoo (where CLANG_RESOURCE_DIR="../../../../lib/clang/${clang_version}"):

FAIL: Clang :: ClangScanDeps/resource_directory.c (1961 of 29550)
******************** TEST 'Clang :: ClangScanDeps/resource_directory.c' FAILED ********************
Script:
--
: 'RUN: at line 3';   rm -rf /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp && mkdir /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp
: 'RUN: at line 4';   cp /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/* /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp
: 'RUN: at line 8';   sed -e "s|CLANG|/our/custom/bin/clang|g" -e "s|DIR|/var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp|g"    /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/cdb.json.template > /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/cdb_path.json
: 'RUN: at line 10';   /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/bin/clang-scan-deps -compilation-database /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/cdb_path.json --format experimental-full    --resource-dir-recipe modify-compiler-path > /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/result_path.json
: 'RUN: at line 12';   cat /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/result_path.json | sed 's:\\\\\?:/:g' | /usr/lib/llvm/14/bin/FileCheck /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/resource_directory.c --check-prefix=CHECK-PATH
: 'RUN: at line 18';   chmod +x /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/compiler
: 'RUN: at line 19';   sed -e "s|CLANG|/var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/compiler|g" -e "s|DIR|/var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp|g"    /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/cdb.json.template > /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/cdb_invocation.json
: 'RUN: at line 21';   /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/bin/clang-scan-deps -compilation-database /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/cdb_invocation.json --format experimental-full    --resource-dir-recipe invoke-compiler > /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/result_invocation.json
: 'RUN: at line 23';   cat /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/result_invocation.json | sed 's:\\\\\?:/:g' | /usr/lib/llvm/14/bin/FileCheck /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/resource_directory.c --check-prefix=CHECK-INVOCATION
--
Exit Code: 1

Command Output (stderr):
--
+ : 'RUN: at line 3'
+ rm -rf /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp
+ mkdir /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp
+ : 'RUN: at line 4'
+ cp /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/cdb.json.template /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/compiler /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/mod.h /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/module.modulemap /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/tu.c /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp
+ : 'RUN: at line 8'
+ sed -e 's|CLANG|/our/custom/bin/clang|g' -e 's|DIR|/var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp|g' /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/Inputs/resource_directory/cdb.json.template
+ : 'RUN: at line 10'
+ /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/bin/clang-scan-deps -compilation-database /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/cdb_path.json --format experimental-full --resource-dir-recipe modify-compiler-path
+ : 'RUN: at line 12'
+ /usr/lib/llvm/14/bin/FileCheck /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/resource_directory.c --check-prefix=CHECK-PATH
+ sed 's:\\\\\?:/:g'
+ cat /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/result_path.json
/var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/resource_directory.c:14:21: error: CHECK-PATH-NEXT: expected string not found in input
// CHECK-PATH-NEXT: "/our/custom/lib{{.*}}"
                    ^
<stdin>:21:17: note: scanning from here
 "-resource-dir",
                ^
<stdin>:22:2: note: possible intended match here
 "/our/custom/bin/../../../../lib/clang/14.0.0",
 ^

Input file: <stdin>
Check file: /var/tmp/portage/sys-devel/clang-14.0.0.9999/work/clang/test/ClangScanDeps/resource_directory.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          16:  "generic", 
          17:  "-triple", 
          18:  "i686-pc-linux-gnu", 
          19:  "-fmodules-validate-system-headers", 
          20:  "-fimplicit-module-maps", 
          21:  "-resource-dir", 
next:14'0                     X~ error: no match found
          22:  "/our/custom/bin/../../../../lib/clang/14.0.0", 
next:14'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:14'1      ?                                                possible intended match
          23:  "-fmodule-format=obj", 
next:14'0     ~~~~~~~~~~~~~~~~~~~~~~~~
          24:  "-fmodules-strict-context-hash", 
next:14'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          25:  "-fmodules-cache-path=/var/tmp/portage/sys-devel/clang-14.0.0.9999/work/x/y/clang-abi_x86_32.x86/test/ClangScanDeps/Output/resource_directory.c.tmp/cache", 
next:14'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          26:  "-isystem", 
next:14'0     ~~~~~~~~~~~~~
          27:  "/our/custom/bin/../../../../lib/clang/14.0.0/include", 
next:14'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .
>>>>>>

--

********************

This change causes a test regression on Gentoo (where CLANG_RESOURCE_DIR="../../../../lib/clang/${clang_version}"):

Thanks for reporting this. Should be fixed in D115628.