This is an archive of the discontinued LLVM Phabricator instance.

Add support to find out resource dir and add it as compilation args
ClosedPublic

Authored by kousikk on Oct 17 2019, 10:57 AM.

Details

Summary

If -resource-dir is not specified as part of the compilation command, then by default
clang-scan-deps picks up a directory relative to its own path as resource-directory.
This is probably not the right behavior - since resource directory should be picked relative
to the path of the clang-compiler in the compilation command.
This patch adds support for it along with a cache to store the resource-dir paths based on
compiler paths.

Notes:

  1. "-resource-dir" is a behavior that's specific to clang, gcc does not have that flag. That's why if I'm not able to find a resource-dir, I quietly ignore it.
  2. Should I also use the mtime of the compiler in the cache? I think its not strictly necessary since we assume the filesystem is immutable.
  3. From my testing, this does not regress performance.
  4. Will try to get this tested on Windows.

But basically the problem that this patch is trying to solve is, clients might not always want to specify
"-resource-dir" in their compile commands, so scan-deps must auto-infer it correctly.

Event Timeline

kousikk created this revision.Oct 17 2019, 10:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 17 2019, 10:57 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
kousikk edited the summary of this revision. (Show Details)Oct 17 2019, 10:58 AM
kousikk edited the summary of this revision. (Show Details)
jkorous added inline comments.Oct 17 2019, 3:19 PM
clang/tools/clang-scan-deps/ClangScanDeps.cpp
71

I haven't looked into this in detail but it feels kind of wasteful to start a process just to get this value. Can't we just expose it in some of the clang libs we already link against?

@kousikk could we please add support for generic -extra-arg, which can then add -resource-dir like the other clang tools?

kousikk added a comment.EditedOct 17 2019, 5:24 PM

I haven't looked into this in detail but it feels kind of wasteful to start a process just to get this value.

Right, that's why I cache this information. So multiple compile commands sharing the same compiler process will trigger at most 1 subprocess and then subsequently use the cached information.

Can't we just expose it in some of the clang libs we already link against?

The problem is, the value of resource-dir depends on CLANG_VERSION and ARCH #define's, that's set when the clang binary is built. So lets say you use /home/kousikk/clang/7.0/bin/clang, then if your compilation database is:

[
  { "command": "/home/kousikk/clang/7.0/bin/clang -c /home/kousikk/test.cpp",
    "file": "/home/kousikk/test.cpp",
    "directory": "/home/kousikk/"
  }
]

then the value of -resource-dir for compilation of test.cpp must be something like /home/kousikk/clang/7.0/lib64/clang/7.0.0/. Any libs used as part of building clang-scan-deps itself is going to point to its own resource-directory which is NOT the behaviour we want.


could we please add support for generic -extra-arg, which can then add -resource-dir like the other clang tools?

@arphaman would this be an option to clang-scan-deps? If so it might not work since commands in the compilation-database can be using more than 1 version of clang?

In addition, every project on which we use scan-deps will inturn have to specify this extra argument - wouldn't it be better if they didn't have to do that?

The current assumption is that the clang-scan-deps binary is the one that comes next to the clang binary you are using. There are lots of other differences between clang versions than just the resource-dir.

MaskRay added inline comments.
clang/lib/Tooling/ArgumentsAdjusters.cpp
43 ↗(On Diff #225468)

Just delete /*unused*/

I've added Manuel as a reviewer as this patch is also changing the tooling APIs.

since resource directory should be picked relative
to the path of the clang-compiler in the compilation command.

The resource dir should be the resource dir that shipped with the clang source code that the *tool* was built with.
We can think about the resource dir as files that should really have been compiled into the tool.
If the compiler has a different version, its resource dir might break the tool.

I discussed with @klimek about this. The conclusion we arrived at was:

Allowing override of resource-dir for all tools is not something we should do (for reasons that @klimek has outlined above). However, the dependency scanner tool can choose to override resource-dir, iff we are sure that it would work across different versions of clang.

And my reasoning for why resource-dir's behaviour when overridden in clang-scan-deps will remain the same across different versions of clang is:

  1. The tool runs only the preprocessor. The preprocessor spec hasn't changed in a backwards incompatible way as far as I know.
  2. In addition to (1), the tool doesn't use anything other than header files from resource-dir, for the purpose of dependency scanning from the preprocessor.
Bigcheese requested changes to this revision.Oct 29 2019, 3:25 PM

So I agree solving this problem makes sense, but I have some issues with the current patch.

  • This defaults to doing the lookup. The test suite, and some real outputs, just uses "clang ..." as the command, not an absolute path. If we always do the lookup then this would be whatever clang is in PATH, not the clang we're testing.
  • While looking into the above, I think I discovered that the way FindResourceDir constructs ClangBinaryPath is broken. It uses the "directory" entry from the compilation database, which is the working directory of where the compile occurred. It should be extremely rare for this ever to contain a clang, and even rarer for it to be the one that was intended.
  • This touches a lot of clang/lib/Tooling that I don't think it needs to.

For the first issue we may just want to fix the tests to point to a specific clang, it's annoying, but doable. An alternative would be to only do FindResourceDir when given an absolute path to a compiler, instead of trying to guess. I think I prefer this approach, as if you have an installed clang in your path, there should be a clang-scan-deps next to it, which would use the same resource dir.

For the second and third issues, I think you can just remove the StringRef Directory arg and use FindInEnvPath to get the correct clang.

You shouldn't need it anymore, but you can replace MakeAbsolutePath with llvm::sys::path::make_absolute.

This revision now requires changes to proceed.Oct 29 2019, 3:25 PM
kousikk updated this revision to Diff 227600.Nov 2 2019, 10:10 PM
kousikk marked an inline comment as done.

@Bigcheese - thanks, all the three points make sense!

An alternative would be to only do FindResourceDir when given an absolute path to a compiler, instead of trying to guess. I think I prefer this approach, as if you have an installed clang in your path, there should be a clang-scan-deps next to it, which would use the same resource dir.

I really like the idea of adding resource-dir only if the given clang path is an absolute path.
As you pointed out, this probably solves all the three concerns you mentioned.

Updated the patch with the change and reverted changes to clang/lib/Tooling.

kousikk marked an inline comment as done.Nov 2 2019, 10:10 PM
kousikk updated this revision to Diff 227601.Nov 2 2019, 10:12 PM

Forgot to run clang-format

Bigcheese accepted this revision.Nov 13 2019, 4:01 PM

lgtm with the changes to FindResourceDir.

clang/tools/clang-scan-deps/ClangScanDeps.cpp
50

This doesn't need to return a std::string as the lifetime of the returned string is the same as the lifetime of ResourceDirectoryCache::Cache. It can be a StringRef instead.

50

FindResourceDir should be findResourceDir.

75–79

It would be nice if this didn't need to be a real file, but it looks like llvm::sys::Execute can only handle file paths.

This revision is now accepted and ready to land.Nov 13 2019, 4:01 PM
kousikk updated this revision to Diff 230508.Nov 21 2019, 12:27 PM
kousikk marked 4 inline comments as done.

Address review comments

clang/tools/clang-scan-deps/ClangScanDeps.cpp
75–79

Yup - I'm going to leave it as such for now.

Thanks, you're good to commit.

kousikk closed this revision.Nov 21 2019, 1:08 PM
kousikk reopened this revision.Nov 21 2019, 1:22 PM
This revision is now accepted and ready to land.Nov 21 2019, 1:22 PM
This revision was automatically updated to reflect the committed changes.