This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Perform system include extraction inside CommandMangler
ClosedPublic

Authored by nridge on Sep 13 2022, 1:18 AM.

Details

Summary

It needs to run after edits from config files are applied to
the compile command (because the config file may specify the
compiler), and before resolveDriver() runs at the end of
CommandMangler.

As part of this change, QueryDriverDatabase is renamed to
SystemIncludeExtractor and is no longer a GlobalCompilationDatabase.

Fixes https://github.com/clangd/clangd/issues/1089
Fixes https://github.com/clangd/clangd/issues/1173
Fixes https://github.com/clangd/clangd/issues/1263

Diff Detail

Event Timeline

nridge created this revision.Sep 13 2022, 1:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 13 2022, 1:18 AM
Herald added a subscriber: arphaman. · View Herald Transcript
nridge requested review of this revision.Sep 13 2022, 1:18 AM

So far I've only written a test for https://github.com/clangd/clangd/issues/1089.

I plan to add tests for the other two issues as well, but I thought I'd get the review ball rolling in the mean time to get feedback on the fix approach and test approach.

nridge added inline comments.Sep 13 2022, 1:26 AM
clang-tools-extra/clangd/unittests/ClangdLSPServerTests.cpp
242 ↗(On Diff #459672)

To exercise QueryDriverDatabase, this test assumes that the machine running the test will have a gcc somewhere in its path, and that querying it for includes will result in at least one -isystem flag being added to the command line.

Is this a reasonable assumption for buildbots? Or do we need to introduce some kind of abstraction so that the test doesn't actually try to find and execute gcc?

thanks, i agree that we should make query driver work coherently with CompileFlags.Compiler, but I am not sure if we really need to complexity for handling this adjustment logic before resolving the driver. I've got some comments right next to the change but to summarize to possible interactions i can think of:

  • User has an absolute path either through compile commands.json or config, which means resolve logic is going to be no-op (well apart form symlink resolution, which I believe is tricky/rare enough that we shouldn't try to special case things for the sake of it).
  • We've got just the driver name, and it's generic like gcc, I don't think we should do anything custom here (I didn't go through all the bugs you've linked, so PLMK if there are common counter examples to this) because default clang heuristics and gcc heuristics we would get out of a system-installed gcc should be pretty much the same. and if the user has a custom system installed gcc, they can either modify their compile commands.json or update clangd config to provide the absolute path of the compiler. I am saying this is rare, because usually those custom toolchains have the target in the binary name.
  • We've got a custom driver name, e.g. arm-none-eabi-gcc. I think what we today is again the desired behaviour, we might try to look for it inside the directory mentioned in the compile commands though, when it isn't found inside $PATH.

Another question is actually about performing the driver resolution before/after we apply the user-supplied edits. in theory we might end up overriding a user provided CompileFlags.Compiler and I don't really have a good sense of what would be desired here.

clang-tools-extra/clangd/CompileCommands.cpp
309

why is this necessary? i.e. are there cases where heuristics in clang-driver today cannot identify the same set of includes we would extract from a system installed gcc ?

because if we were to get rid of this requirement, we can just run this as an extra arg adjuster after running command mangler, without changing the logic in CommandMangler at all (and also rendering the underlying patch unnecessary (i am still not sure if it's needed, but need to take a closer look there)).

clang-tools-extra/clangd/unittests/ClangdLSPServerTests.cpp
242 ↗(On Diff #459672)

i think it's better to have this as a lit test rather than a unittest. that way we can set up the temp env in the lit test with appropriate paths and check for the output of clangd --check=foo.cc. (we should also have a mock shell script called gcc that'll print out expected lines, we already have this in clang-tools-extra/clangd/test/system-include-extractor.test

because this doesn't only rely on existence of gcc on buildbots, but also anyone that wants to run clangd-check locally (and might interact weirdly depending on where/which gcc is installed).

I think the sticking point for just having QueryDriverDatabase run after the entirety of CommandMangler is this check in resolveDriver(), which replaces e.g. gcc with /path/to/clang/bin/gcc, which likely does not actually exist (i.e. QueryDriverDatabase will not be able to query it, when it might have been able to look up gcc in PATH).

Do you know what the purpose of this logic in resolveDriver() is, and in particular:

  • why is gcc treated differently than say gcc-12?
  • why is turning gcc into /path/to/clang/bin/gcc performed before trying to resolve gcc against PATH?

If we can change the order of the checks in resolveDriver() such that we try to resolve gcc against PATH first, and only turn it into /path/to/clang/bin/gcc if it was not found in PATH, I think that would not interfere with QueryDriverDatabase.

I think the sticking point for just having QueryDriverDatabase run after the entirety of CommandMangler is this check in resolveDriver(), which replaces e.g. gcc with /path/to/clang/bin/gcc, which likely does not actually exist (i.e. QueryDriverDatabase will not be able to query it, when it might have been able to look up gcc in PATH).

Do you know what the purpose of this logic in resolveDriver() is, and in particular:

  • why is gcc treated differently than say gcc-12?
  • why is turning gcc into /path/to/clang/bin/gcc performed before trying to resolve gcc against PATH?

We thought generic driver names would actually imply "not caring" about the compiler, hence it would be better to just use installation that came with clang, if we were able to detect any. this extremely helpful on macs, i think without this we won't have a working setup by default. since the clang coming from $PATH doesn't have stdlib installed, whereas clang we discover through xcode has it.

If we can change the order of the checks in resolveDriver() such that we try to resolve gcc against PATH first, and only turn it into /path/to/clang/bin/gcc if it was not found in PATH, I think that would not interfere with QueryDriverDatabase.

Right, that would be one way to go. But as explained above, I think this is actually making most of the "don't care" cases work by default and it wouldn't be great to give them up, without understanding the tradeoff.
Hence I was trying to understand what exactly are we supporting by not doing this resolution eagerly. i.e.:

  • what are the cases in which we have a generic driver name in the compile commands, and user indeed has all the necessary system headers installed relative to resolved path of that generic compiler name in $PATH
  • do we lose anything by not resolving a driver name provided by CompileFlags.Compiler.

Kadir and I discussed this a bunch today. FWIW I agree with this patch: sticking the query-driver step in the middle of the pipeline is the right way to solve these bugs.
Putting it on the end and hoping to get away with it feels risky to me.

clang-tools-extra/clangd/ClangdLSPServer.cpp
468–469

this is carefully preserving existing behavior: --query-driver has no effect when --compile_commands_from=lsp

I don't think that behavior is intended, we can remove it either here or in a followup patch

clang-tools-extra/clangd/ClangdLSPServer.h
79 ↗(On Diff #459672)

(if the unit tests gets converted back to a lit test, we should probably remove this)

clang-tools-extra/clangd/CompileCommands.cpp
309

I think Kadir is suggesting allowing a bug (handling of gcc + --query-driver) that we think is unimportant, in order to simplify the implementation.

After staring at this for a while, I think having the CommandMangler encapsulate all the ugly stuff we do here is the right *interface*, and that justifies the implementation being a bit ugly.

clang-tools-extra/clangd/CompileCommands.h
54

caches notwithstanding, CommandMangler is a "simple struct" - I think this field should be public for consistency.
(And set manually rather than passed to detect() - there's no detection involved)

clang-tools-extra/clangd/QueryDriverDatabase.cpp
318

this is renaming the class, but not the file, the flag, etc.
The actual name aside, such a rename should probably be more complete and a separate commit. If we keep the --query-driver flag we'll have at least two names, so we have to weigh that against the betterness of the name too.

(I agree that the current name is somewhat confusing. The new name is somewhat inaccurate: we're extracting from a compiler rather than the (operating?) system, and we extract more than includes. I prefer the new one on balance but maybe ToolchainFlagsExtractor or something would be even better...)

clang-tools-extra/clangd/unittests/ClangdLSPServerTests.cpp
74 ↗(On Diff #459672)

The new destruction order looks much less obvious, even if we can get away with it at the moment.
Maybe add an accessor instead?

Thanks for the feedback! Yeah I'd be wary of introducing a corner case where the user passes --query-driver, and there is in fact a driver available to query in PATH, but we end up not querying it. Even if the outcome is the same in cases we can think of, it feels like a footgun that's going to bite some user somewhere.

clang-tools-extra/clangd/QueryDriverDatabase.cpp
318

Mostly I wanted to drop the Database from QueryDriverDatabase because it was no longer a GlobalCompilationDatabase, and just QueryDriver didn't sound right (a verb, not a noun).

I do see the value in sticking to the terminology already used in the flag. If you're happy with the CompileCommandsAdjuster abstraction (and its name), we could make it QueryDriverAdjuster?

Thanks for the feedback! Yeah I'd be wary of introducing a corner case where the user passes --query-driver, and there is in fact a driver available to query in PATH, but we end up not querying it. Even if the outcome is the same in cases we can think of, it feels like a footgun that's going to bite some user somewhere.

Just to give my perspective here to close the loop, I agree that there could be users effected from this I just wasn't sure if that would be common enough (+ we have workarounds by specifying the absolute path in config, database etc.) to justify both performing all the necessary review and maintenance afterwards. I think we've agreed on a somewhat less intrusive way for the change with Sam for the maintenance efforts, and we've already paid the review costs anyways. So I am also fine with going down this path, and hopefully not ending up seeing some surprises on dark corners of the universe.

nridge updated this revision to Diff 462704.Sep 25 2022, 1:37 AM
nridge marked 2 inline comments as done.

This is mostly just a rebase on top of the update to D133756 to turn
CompileCommandsAdjuster into a unique_function, and addressing a few
minor comments.

I still need to re-formulate the test to use lit tests, I will work
on that next.

I think this patch is great apart from the testing issue.

(I do see the value in a good end-to-end test for the exact bug being fixed, but would also be fine with this being a unit-test of CommandMangler with the query driver part mocked out)

clang-tools-extra/clangd/QueryDriverDatabase.cpp
318

Mostly I wanted to drop the Database from QueryDriverDatabase

Yup, makes sense. We should drop it from the filename too, right? On reflection renaming the class in this patch and the file in a subsequent one also seems reasonable.

If you're happy with the CompileCommandsAdjuster abstraction (and its name), we could make it QueryDriverAdjuster?

As mentioned on the other patch I'm not really happy with the Adjuster name, too much trauma from tooling::ArgumentsAdjuster :-)
Having thought about it more, I like SystemIncludeExtractor a lot, even if it doesn't totally cover everything.

So I'd prefer either:

  • best name: rename everything to SystemIncludeExtractor except the flag. (i.e. this patch, with a followup commit renaming the file)
  • most consistent: rename everything to QueryDriver - it's a verb, but it's also a functor

up to you

clang-tools-extra/clangd/unittests/ClangdLSPServerTests.cpp
242 ↗(On Diff #459672)

Agree, I know this is a pain but it's not just buildbots: people run these tests when building official llvm releases or distro packages for all sorts of systems, and I think a dep on gcc might be a problem. (e.g. on windows, but also I think a basic FreeBSD install has clang but not gcc these days?)

nridge updated this revision to Diff 471925.Oct 31 2022, 1:57 AM

reformulate test as lit test
address other review comments

nridge retitled this revision from [clangd] Turn QueryDriverDatabase into a CompileCommandsAdjuster to [clangd] Perform system include extraction inside CommandMangler.Oct 31 2022, 1:58 AM
nridge edited the summary of this revision. (Show Details)
nridge added a comment.EditedOct 31 2022, 2:03 AM

I believe this addresses the remaining review comments. I will follow up with a patch to rename QueryDriverDatabase.cpp to SystemIncludeExtractor.cpp.

clang-tools-extra/clangd/QueryDriverDatabase.cpp
318

So I'd prefer either:

  • best name: rename everything to SystemIncludeExtractor except the flag. (i.e. this patch, with a followup commit renaming the file)
  • most consistent: rename everything to QueryDriver - it's a verb, but it's also a functor

up to you

I went with the first one.

A couple of minor deviations from what you suggested in this comment are:

  1. I declared a typedef for the unique_function at namespace scope rather than inside CommandMangler, so that I can use it in the declaration of getSystemIncludeExtractor()
  2. I named the typedef SystemIncludeExtractorFn to avoid a conflict with the name of the actual implementing class

Let me know if you prefer some other choices here.

nridge added inline comments.Oct 31 2022, 2:11 AM
clang-tools-extra/clangd/test/system-include-extractor.test
85

Ugh, this doesn't quite work in that whether we get diagnostics for the user config file depends on whether it exists. It happens to on my system but presumably it may not on a buildbot?

Should I create a user config file in the lit test so it reliably exists (and delete it at the end)? Or is there some FileCheck magic I can employ that would ignore an optional second publishDiagnostics which is not the interesting one?

nridge updated this revision to Diff 472235.Nov 1 2022, 1:38 AM

Improve handling of config file diagnostics in lit test

nridge added inline comments.Nov 1 2022, 1:40 AM
clang-tools-extra/clangd/test/system-include-extractor.test
85

Ok, I figured out a way to handle this properly:

  • I included clangd's stderr in the FileCheck input so we could match lines from stderr like "ASTWorker building file"
  • By first looking for "ASTWorker building file" and then the next publishDiagnostics, we can be sure that the diagnostics pertain to the source file and not a config file

I will follow up with a patch to rename QueryDriverDatabase.cpp to SystemIncludeExtractor.cpp.

Posted in D137401

sammccall accepted this revision.Nov 7 2022, 6:48 AM

Woohoo, thanks!

This revision is now accepted and ready to land.Nov 7 2022, 6:48 AM
This revision was landed with ongoing or failed builds.Nov 7 2022, 3:03 PM
This revision was automatically updated to reflect the committed changes.
ayzhao added a subscriber: ayzhao.Nov 8 2022, 2:35 PM

This change is causing system-include-extractor.test to fail on Chrome: https://crbug.com/1382508

The test fails on this assertion if the path to the LLVM repository contains the plus sign (+). The actual output replaces the plus sign with the URL escape code %2B, but the still expects the plus sign in the output.

nridge added a comment.Nov 8 2022, 2:45 PM

This change is causing system-include-extractor.test to fail on Chrome: https://crbug.com/1382508

The test fails on this assertion if the path to the LLVM repository contains the plus sign (+). The actual output replaces the plus sign with the URL escape code %2B, but the still expects the plus sign in the output.

Thanks for the report, I sent out D137674 to work around this.

This change is causing system-include-extractor.test to fail on Chrome: https://crbug.com/1382508

The test fails on this assertion if the path to the LLVM repository contains the plus sign (+). The actual output replaces the plus sign with the URL escape code %2B, but the still expects the plus sign in the output.

Thanks for the report, I sent out D137674 to work around this.

Oops, I landed a slightly worse version in 96d771c4707df9b3153ea5a8451f2bbbe17c8917 before seeing your message. @ayzhao it should be fixed at head.

ayzhao added a comment.Nov 8 2022, 2:53 PM

This change is causing system-include-extractor.test to fail on Chrome: https://crbug.com/1382508

The test fails on this assertion if the path to the LLVM repository contains the plus sign (+). The actual output replaces the plus sign with the URL escape code %2B, but the still expects the plus sign in the output.

Thanks for the report, I sent out D137674 to work around this.

Oops, I landed a slightly worse version in 96d771c4707df9b3153ea5a8451f2bbbe17c8917 before seeing your message. @ayzhao it should be fixed at head.

Thanks!