This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Move standard options adaptor to CommandMangler
ClosedPublic

Authored by DmitryPolukhin on Feb 6 2023, 2:39 PM.

Details

Summary

There is a discrepancy between how clangd processes CDB loaded from
JSON file on disk and pushed via LSP. Thus the same CDB pushed via
LSP protocol may not work as expected. Some difference between these two
paths is expected but we still need to insert driver mode and target from
binary name and expand response files.

Part of the diff addTargetAndModeForProgramName came from D138546

Test Plan: check-clang-tools

Diff Detail

Event Timeline

DmitryPolukhin created this revision.Feb 6 2023, 2:39 PM
DmitryPolukhin requested review of this revision.Feb 6 2023, 2:39 PM

Fix clang-format sources

nridge added a subscriber: nridge.Feb 11 2023, 7:58 PM

Note that D138546 is removing the call to inferTargetAndDriverMode() from GlobalCompilationDatabase.cpp. It adds equivalent logic too CommandMangler which is used for CDBs pushed from LSP as well, so it accomplishes objective of this patch with respect to inferTargetAndDriverMode (but not inferMissingCompileCommands).

Note that D138546 is removing the call to inferTargetAndDriverMode() from GlobalCompilationDatabase.cpp. It adds equivalent logic too CommandMangler which is used for CDBs pushed from LSP as well, so it accomplishes objective of this patch with respect to inferTargetAndDriverMode (but not inferMissingCompileCommands).

@nridge I think D138546 solves the issue only partially and we need both patches applied. inferMissingCompileCommands and expandResponseFiles also needs to be applied to command pushed via LSP. I can rebase on top of D138546 if it will be landed first and update test. Please advise which patch should land first?

There is a discrepancy between how clangd processes CDB loaded from JSON file on disk and pushed via LSP.

That's actually because we model compile commands pushed via LSP as a "fixed compilation database" rather than a json compilation database (you can check the code in parseFixed, near parseJson).
The reason behind the discrepancy between fixed and json compilation databases mostly arises from the fact that the former is written by people specifically to be used by clang-tooling, whereas the latter is usually provided by build system and doesn't necessarily have the same concerns as clang-tooling hence requires certain modifications.

That being said, the two particular transformations (response files & target/mode inference) seem like outliers when it comes to such transformations. These are handled by clang-driver binary, outside of the driver library, in https://github.com/llvm/llvm-project/blob/main/clang/tools/driver/driver.cpp#L426 and https://github.com/llvm/llvm-project/blob/main/clang/tools/driver/driver.cpp#L534. Therefore all the other tools that wants to make use of clang infra tries to simulate this logic, at different layers with slight differences (at the bare minimum, as the logic isn't shared and people mostly care about the driver, logic in other places just gets out-of-date).
I believe the right thing to do here is putting all that arg preparation logic into driver library itself, so that all code paths that tries to create a compiler invocation can have the same behaviour. Unfortunately this is likely too hard to pull at this stage, due to people relying on almost every implementation detail of these interfaces.
So a middle ground would probably involve, moving that logic inside driver binary to a common library, so that future users can at least directly use it rather than trying to come up with their own interpretation (or trying to chose from existing N patterns) and we'll get rid of the issues that result from logic in this place and its duplicates getting out-of-sync. This way we don't need to migrate all the applications that want to create a compiler invocation to a new API and can only migrate clangd (CommandMangler is I believe the right layer for these purposes. as it handles all "string"-like modifications on the compile flags today).

Does that make sense? Is it something you'd like to propose a patch for? Because as things stand I think we're just making the matter worse here by introducing some new code paths that're trying to emulate the logic in driver today and will get out of sync at some point.

clang-tools-extra/clangd/ClangdLSPServer.cpp
1350 ↗(On Diff #495431)

it looks like this comparison is no longer valid.

OverlayCDB::getCompileCommand will apply more modifications to the stored flags than just expanding response files and inferring targets. I believe the intent of this interaction was to treat compile flags from the LSP client as-is without any modifications (as I explained more in my main comment).

No action needed right now, just thinking out-loud. I think the proper thing to do here is apply these "common driver transformations" here, and store/use the flags as is going forward inside clangd, without extra mangling.

That's actually because we model compile commands pushed via LSP as a "fixed compilation database" rather than a json compilation database (you can check the code in parseFixed, near parseJson).
The reason behind the discrepancy between fixed and json compilation databases mostly arises from the fact that the former is written by people specifically to be used by clang-tooling, whereas the latter is usually provided by build system and doesn't necessarily have the same concerns as clang-tooling hence requires certain modifications.

Thank you for sharing the context that I didn't know. I think it makes some sense but it requires duplication of this logic in some other places and keeping them up-to-date with driver changes.

That being said, the two particular transformations (response files & target/mode inference) seem like outliers when it comes to such transformations. These are handled by clang-driver binary, outside of the driver library, in https://github.com/llvm/llvm-project/blob/main/clang/tools/driver/driver.cpp#L426 and https://github.com/llvm/llvm-project/blob/main/clang/tools/driver/driver.cpp#L534. Therefore all the other tools that wants to make use of clang infra tries to simulate this logic, at different layers with slight differences (at the bare minimum, as the logic isn't shared and people mostly care about the driver, logic in other places just gets out-of-date).

I think clangd should be able to make this transformations if they are required even for "fixed" flags and it shouldn't change anything if compilation flags don't require them.

I believe the right thing to do here is putting all that arg preparation logic into driver library itself, so that all code paths that tries to create a compiler invocation can have the same behaviour. Unfortunately this is likely too hard to pull at this stage, due to people relying on almost every implementation detail of these interfaces.
So a middle ground would probably involve, moving that logic inside driver binary to a common library, so that future users can at least directly use it rather than trying to come up with their own interpretation (or trying to chose from existing N patterns) and we'll get rid of the issues that result from logic in this place and its duplicates getting out-of-sync. This way we don't need to migrate all the applications that want to create a compiler invocation to a new API and can only migrate clangd (CommandMangler is I believe the right layer for these purposes. as it handles all "string"-like modifications on the compile flags today).

Does that make sense? Is it something you'd like to propose a patch for? Because as things stand I think we're just making the matter worse here by introducing some new code paths that're trying to emulate the logic in driver today and will get out of sync at some point.

I see your point and it does make sense to me. I'll try to move logic to CommandMangler instead so both codepaths will share it. As for keeping up with driver changes, it is real issue and I'm not sure that we can solve this problem without using driver code almost "as is" that is not possible without serious refactoring. I'll ping you when changes are ready for review.

clang-tools-extra/clangd/ClangdLSPServer.cpp
1350 ↗(On Diff #495431)

I think it won't hurt anything and it was not very useful even before my change. Now it just prevent small extra work when you set the same arguments more than one on arguments are very simple so no transformation actually happening.

Move standard adaptors to CommandMangler

nridge added inline comments.Feb 16 2023, 8:44 AM
clang-tools-extra/clangd/CompileCommands.cpp
216

The target needs to be added after the call to SystemIncludeExtractor later in this function (this is what D138546 is trying to fix). The reason is that SystemIncludeExtractor includes any --target flag in the compiler driver being queried for system includes, which may be gcc, which does not support --target.

nridge added inline comments.Feb 16 2023, 8:47 AM
clang-tools-extra/clangd/CompileCommands.cpp
216

(I guess we could make that change separately in D138546, but if we're changing the place where the --target is added in this patch, I figure we might as well move it directly to the desired place.)

Add test for expanded response files

@nridge I'm sorry, I pushed version without all tests. Now I added test case for response files, it seems that clangd hasn't had it before this patch.

clang-tools-extra/clangd/CompileCommands.cpp
216

I think there are order problems here:

  • we need --driver-mode=cl injected here to make check on line #229 work as expected
  • if we don't inject it driver mode here, command line edits won't see the change; see how I modified test clangd/unittests/CompileCommandsTests.cpp line #203, with D138546 edits were not applied properly to driver mode

I'll double check how it works on Windows but I expect that moving it after SystemIncludeExtractor will break proper driver mode detection.

Rebase and retest, cannot reproduce test failure locally

@nridge @kadircet please take another look, all tests now pass. As for the location of inserting driver mode please see my previous comment about MSVC CL check.

nridge added inline comments.Feb 20 2023, 11:48 PM
clang-tools-extra/clangd/CompileCommands.cpp
216

I think there are order problems here:

  • we need --driver-mode=cl injected here to make check on line #229 work as expected

Looking at the line in question, it calls driver::getDriverMode() which falls back on extracting the driver mode from the program name anyways -- so I think that should be fine.

  • if we don't inject it driver mode here, command line edits won't see the change; see how I modified test clangd/unittests/CompileCommandsTests.cpp line #203, with D138546 edits were not applied properly to driver mode

I'm not following the motivation for this test. Is there any real-world use case which requires the command line edits seeing the --driver-mode parameter?

DmitryPolukhin added inline comments.Feb 21 2023, 5:50 AM
clang-tools-extra/clangd/CompileCommands.cpp
216

I'm not following the motivation for this test. Is there any real-world use case which requires the command line edits seeing the --driver-mode parameter?

@nridge, it will break existing behaviour when user was able to remove these inserted options like this CompileFlags: {Remove: ['--driver-mode=*', '--target=*']}. If I move call of addTargetAndModeForProgramName after SystemIncludeExtractorr, inserted options will stay in list. User may only override it with CompileFlags.Compiler but it will override all compilers. I don't know is it real problem or not so if you think I should still move it after SystemIncludeExtractor, I'll do it.

nridge added inline comments.Feb 21 2023, 7:32 PM
clang-tools-extra/clangd/CompileCommands.cpp
216

@nridge, it will break existing behaviour when user was able to remove these inserted options like this CompileFlags: {Remove: ['--driver-mode=*', '--target=*']}.

Thanks. I'll leave the final word to @kadircet here but in my opinion, being able to remove those flags via Remove is not important; as you say, the user can change the Compiler to influence the driver mode if needed.

Move call addTargetAndModeForProgramName after SystemIncludeExtractor

DmitryPolukhin retitled this revision from [clangd] Apply standard adaptors to CDBs pushed from LSP to [clangd] Move standard options adaptor to CommandMangler.Feb 22 2023, 2:36 AM
DmitryPolukhin edited the summary of this revision. (Show Details)
DmitryPolukhin edited the summary of this revision. (Show Details)Feb 22 2023, 2:38 AM

Failing tests seems to be just flacky, I was not able to reprocuce any issue with them locally in stress runs and under ASan and TSan

@kadircet friendly ping, could you please take a look to this diff?

@kadircet and @nridge friendly ping could you please accept this diff or let me know if you would like to change anything here?
I moved addTargetAndModeForProgramName after SystemIncludeExtractor as was suggested.

@kadircet it is obvious that there is something in this diff that causes this hesitancy in accepting it. I'm ready to keep iterating on the solution but I need a clue what needs the improvement. Please comment.

@kadircet it is obvious that there is something in this diff that causes this hesitancy in accepting it. I'm ready to keep iterating on the solution but I need a clue what needs the improvement. Please comment.

sorry for the late reply, i was on vacation and just started catching up with things. this looks great, thanks!

clang-tools-extra/clangd/CompileCommands.cpp
199

getting real filesystem is cheap, no need to make this part of the state, just inline it into tooling::addExpandedResponseFiles call.

216

agreed. addTargetAndModeForProgramName will add these flags only when they're not present. deleting --driver-mode or --target, without providing a fixed replacement doesn't sound like a real use case.

clang-tools-extra/clangd/test/Inputs/did-change-configuration-params.args
1 ↗(On Diff #502110)

you can test expandResponseFile behaviour through unittests as well, see https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/unittests/CompileCommandsTests.cpp#L133 for an example. i don't think complications in this lit test are worth it. do you mind moving this to a unittest?

clang-tools-extra/clangd/unittests/ClangdTests.cpp
1240 ↗(On Diff #502110)

this feels unrelated, can you revert?

clang-tools-extra/clangd/unittests/CompileCommandsTests.cpp
53

can you include a target triplet in the compiler name to test out target inference also works as intended?

clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
53

you can drop tooling::

DmitryPolukhin marked 4 inline comments as done.

Comments resolved

@kadircet thank you for the review! Please take another look.

clang-tools-extra/clangd/CompileCommands.cpp
199

We need version of tooling::addExpandedResponseFiles with FS as an argument for testing at least, see https://github.com/llvm/llvm-project/blob/main/clang/unittests/Tooling/CompilationDatabaseTest.cpp#L968 switching it to using real filesystem IMHO makes things worse. So moved this code to the call site.

Fix test on Windows

kadircet accepted this revision.Mar 13 2023, 5:05 AM

thanks, lgtm!

clang-tools-extra/clangd/CompileCommands.cpp
199

So moved this code to the call site.

I was also trying to say that. so all good.

This revision is now accepted and ready to land.Mar 13 2023, 5:05 AM

Had to revert this diff temporary due to broken ARM builds. It seems that X86 targets are disabled for that bots so tests are not passing. I'll fix upload new version for review.

DmitryPolukhin reopened this revision.Mar 14 2023, 4:35 AM
This revision is now accepted and ready to land.Mar 14 2023, 4:35 AM

Update test to use any available target

@kadircet could you please take another look?
I had to update diff to make it work when LLVM is configured without X86 targets so they are not valid and thus not injected into the command line.

tdupes added a subscriber: tdupes.Mar 16 2023, 11:35 AM
asb added a subscriber: asb.Mar 17 2023, 8:37 AM

This broke -DBUILD_SHARED_LIBS=True builds - I committed rG482d22d05a4a30a4f8594273bd359f7d311c9d4c to fix it.

dyung added a subscriber: dyung.Mar 17 2023, 12:15 PM

@DmitryPolukhin your change seems to be causing a crash in 2 ORC tests on a bot (trivial-cxa-atexit.S and trivial-atexit.S), can you take a look or revert if you need time to investigate?

https://lab.llvm.org/buildbot/#/builders/247/builds/2639

******************** TEST 'ORC-x86_64-linux :: TestCases/Linux/x86-64/trivial-cxa-atexit.S' FAILED ********************
Script:
--
: 'RUN: at line 3';      /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./bin/clang   -m64  -c -o /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-cxa-atexit.S.tmp /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/llvm-project/compiler-rt/test/orc/TestCases/Linux/x86-64/trivial-cxa-atexit.S
: 'RUN: at line 4';   /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./bin/llvm-jitlink -orc-runtime=/home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./lib/clang/17/lib/x86_64-unknown-linux-gnu/liborc_rt.a /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-cxa-atexit.S.tmp
--
Exit Code: 139
Command Output (stderr):
--
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./bin/llvm-jitlink -orc-runtime=/home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./lib/clang/17/lib/x86_64-unknown-linux-gnu/liborc_rt.a /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-cxa-atexit.S.tmp
#0 0x00005654e9a19e86 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./bin/llvm-jitlink+0x10aae86)
#1 0x00005654e9a17534 SignalHandler(int) Signals.cpp:0:0
#2 0x00007f57b6259420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
#3 0x00007f57758df13a 
#4 0x00007f57758e27bc 
#5 0x00007f57758f102a 
#6 0x00007f57758e875f 
#7 0x00007f57758f4de2 
/home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-cxa-atexit.S.script: line 2: 3620821 Segmentation fault      (core dumped) /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./bin/llvm-jitlink -orc-runtime=/home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./lib/clang/17/lib/x86_64-unknown-linux-gnu/liborc_rt.a /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-cxa-atexit.S.tmp

@DmitryPolukhin your change seems to be causing a crash in 2 ORC tests on a bot (trivial-cxa-atexit.S and trivial-atexit.S), can you take a look or revert if you need time to investigate?

My changes haven't touched anything in compiler-rt at all but I'm looking into the issue.

@dyung I cannot reproduce this issue on two different machines in both cases tests work fine. For clang and llvm-jitlink binaries (the only binaries used in the tests) my changes are non-functional and I didn't change anything in compiler-rt or ORC. Are you able to reproduce the crashes locally?

dyung added a comment.Mar 17 2023, 5:01 PM

@dyung I cannot reproduce this issue on two different machines in both cases tests work fine. For clang and llvm-jitlink binaries (the only binaries used in the tests) my changes are non-functional and I didn't change anything in compiler-rt or ORC. Are you able to reproduce the crashes locally?

So far I cannot on my own dev machines. I'll take the buildbot offline and try to see if I can figure out what is going on this weekend.

dyung added a comment.Mar 17 2023, 9:25 PM

@dyung I cannot reproduce this issue on two different machines in both cases tests work fine. For clang and llvm-jitlink binaries (the only binaries used in the tests) my changes are non-functional and I didn't change anything in compiler-rt or ORC. Are you able to reproduce the crashes locally?

So far I cannot on my own dev machines. I'll take the buildbot offline and try to see if I can figure out what is going on this weekend.

So after deleting the build directory and clearing the ccache, the problem no longer seems to occur. I'm guessing it might have been corruption somewhere. Sorry about that!

mgorny added a subscriber: mgorny.Mar 24 2023, 10:14 AM

This commit broke building against libclang-cpp.so dylib. I've submitted a fix as D146427 but it haven't received any attention so far.