This is an archive of the discontinued LLVM Phabricator instance.

[Tooling] Handle compilation databases containing commands with double dashes
ClosedPublic

Authored by jnykiel on Mar 17 2021, 3:48 PM.

Details

Summary

As of CMake commit https://gitlab.kitware.com/cmake/cmake/-/commit/d993ebd4,
which first appeared in CMake 3.19.x series, in the compile commands for
clang-cl, CMake puts -- before the input file. When operating on such a
database, the InterpolatingCompilationDatabase - specifically, the
TransferableCommand constructor - does not recognize that pattern and so, does
not strip the input, or the double dash when 'transferring' the compile command.
This results in a incorrect compile command - with the double dash and old input
file left in, and the language options and new input file appended after them,
where they're all treated as inputs, including the language version option.

Test files for some tests have names similar enough to be matched to commands
from the database, e.g.:

.../path-mappings.test.tmp/server/bar.cpp

can be matched to:

.../Driver/ToolChains/BareMetal.cpp

etc. When that happens, the tool being tested tries to use the matched, and
incorrectly 'transferred' compile command, and fails, reporting errors similar
to:

error: no such file or directory: '/std:c++14'; did you mean '/std:c++14'? [clang-diagnostic-error]

This happens in at least 4 tests:

Clang Tools :: clang-tidy/checkers/performance-trivially-destructible.cpp
Clangd :: check-fail.test
Clangd :: check.test
Clangd :: path-mappings.test

The fix for TransferableCommand removes the -- and everything after it when
determining the arguments that apply to the new file. -- is inserted in the
'transferred' command if the new file name starts with - and when operating in
clang-cl mode, also /. Additionally, other places in the code known to do
argument adjustment without accounting for the -- and causing the tests to
fail are fixed as well.

Diff Detail

Event Timeline

jnykiel created this revision.Mar 17 2021, 3:48 PM
jnykiel requested review of this revision.Mar 17 2021, 3:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2021, 3:48 PM
jnykiel edited the summary of this revision. (Show Details)Mar 17 2021, 3:51 PM
aganea added inline comments.Mar 17 2021, 4:25 PM
clang/lib/Tooling/InterpolatingCompilationDatabase.cpp
181

thanks for looking into this!

we also had a related report about this in clangd, but didn't have time to dive into it. this is definitely fixing an issue in interpolating compilation database, but unfortunately there are still issues lurking around in clangd and command line adjusters.

do you mind fixing those cases too? if not i can also make some follow-ups, but as-is this isn't really useful to tools, at least without the adjuster fixes i mentioned in the first item(as they are used by almost all clang-based tools), the rest is specific to clangd.

clang/lib/Tooling/InterpolatingCompilationDatabase.cpp
182

isn't everything after double dash treated as inputs? i think this should be a break instead of a continue.

244

i think it actually makes sense to insert a double dash here too, if the filename starts with a / or - they might actually be treated as compile flags rather than filenames.

jnykiel updated this revision to Diff 331570.Mar 18 2021, 8:29 AM
jnykiel retitled this revision from [Tooling] Handle compilation databases with clang-cl commands generated by CMake 3.19+ to [Tooling] Handle compilation databases containing commands with double dashes.
jnykiel edited the summary of this revision. (Show Details)

I have addressed some of your comments. The second revision is good enough to get the tests to pass - the injectResourceDir change was necessary for the one clang-tidy test that started failing after leaving the -- in. I don't feel familiar enough with clangd specifically to attempt comprehensive fixes there. The clangd issue report specifically mentions -fsyntax-only and -resource-dir as well - so while not comprehensive, the fix may be good for that too.

jnykiel updated this revision to Diff 331605.Mar 18 2021, 10:08 AM

Fixed logic error in injectResourceDir causing a test failure on the build machine.

I have addressed some of your comments. The second revision is good enough to get the tests to pass - the injectResourceDir change was necessary for the one clang-tidy test that started failing after leaving the -- in. I don't feel familiar enough with clangd specifically to attempt comprehensive fixes there. The clangd issue report specifically mentions -fsyntax-only and -resource-dir as well - so while not comprehensive, the fix may be good for that too.

There are a lot more argument adjusters that look at the compile flags and they don't stop at -- though :/ I don't think it is feasible to special case -- in all of those loops and doesn't happen enough in practice (i.e. we see a filename that starts with -resource-dir or -fsyntax-only etc.) to justify a more complex solution. So as mentioned in the comments let's ignore these for now.

clang/lib/Tooling/ArgumentsAdjusters.cpp
47

we should perform this either in all or none. thinking about this again, there are a lot more things that can go wrong but they happen pretty rarely so let's just drop this change and replace the push_back below with getInsertArgumentAdjuster("-fsyntax-only")(AdjustedArgs, "");

copying over args once more isn't crucial here as this happen only once before parsing a translation unit, which by far dominates the latency.

clang/lib/Tooling/InterpolatingCompilationDatabase.cpp
243

the condition should be a function of the Filename, not the compile flag we are using. as the original command might not contain the -- but the filename might still be starting with a - or / here, right?

So i'd suggest Filename.startswith("-") || (ClangCLMode && Filename.startswith("/"))

clang/lib/Tooling/Tooling.cpp
439

similar to comment above, let's just ignore this.

445

nit:

Args = getInsertArgumentAdjuster(("-resource-dir=" + CompilerInvocation::GetResourcesPath(Argv0, MainAddr)).c_str())(Args, "");

There's an overload of getInsertArgumentAdjuster that takes in a const char*

jnykiel updated this revision to Diff 331884.Mar 19 2021, 8:23 AM
jnykiel edited the summary of this revision. (Show Details)

I've addressed your comments (I agree with them, essentially). After implementing the insertion of '--' for problematic file names, I had to modify some interpolation unit tests to get them to consistently pass on both Windows and Unix-like systems - I made it possible to match against a relative 'non-native' path in a test, as the absolute paths in this unit test fixture always start with a drive letter on Windows, thus making it impossible to test the '-' cases there, and always start with a '/' on Unix-like systems, making the expected string different when run on Windows or Unix-like.

kadircet accepted this revision.Mar 22 2021, 1:18 AM

thanks, lgtm!

one final nit: getInsertArgumentAdjuster defaults the position to ArgumentInsertPosition::END when there's a single argument to insert. so you can drop those.

This revision is now accepted and ready to land.Mar 22 2021, 1:18 AM

(also please let me know if I should land this for you, preferably with an email address to attribute the commit to)

jnykiel updated this revision to Diff 332268.Mar 22 2021, 6:26 AM

Thanks. About committing/attribution, I'm not sure how it's supposed to work - I've asked a co-worker (@aganea) to commit this, but if he can't do it, please do it instead. I'm Janusz Nykiel, email address janusz.nykiel at ubisoft.com

SG, will wait for Alexandre to land this then!