This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Use all inputs to SystemIncludeExtractor in cache key

Authored by kadircet on Mar 27 2023, 3:32 AM.



Instead of passing in a tooling::CompileCommand into system include
extraction, pass a limited set, whose elements are used as keys.

Also fix the issue around accepting -isysroot=/foo which isn't a valid
argument (or the directory should be =/foo not /foo).


This should also unblock

Diff Detail

Event Timeline

kadircet created this revision.Mar 27 2023, 3:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 27 2023, 3:32 AM
Herald added a subscriber: arphaman. · View Herald Transcript
kadircet requested review of this revision.Mar 27 2023, 3:32 AM
nridge added inline comments.Apr 3 2023, 12:57 AM

It looks like the purpose of including Directory in the key is to use it in make_absolute in case where Driver is not an absolute path.

I wonder if it would make sense to perform that make_absolute operation eagerly and store its result in the key instead.

I'm thinking of a scenario where a project has say 10,000 source files in 500 directories, and in the compile_commands.json entries the directory is the directory immediately containing the source file (or a corresponding directory in a build/ directory hierarchy). In such cases, having the directory be in the key would mean executing the driver 500 times instead of just once (assuming no other properties are different) during indexing.


The previous behaviour was to abort system include extraction in this case.

Assuming you're not looking to change that behaviour, one way to preserve it could be to have render() check for an empty Lang and return an empty vector in that case, and also have extractSystemIncludesAndTarget() check for render() returning an empty vector and return an empty optional in that case.


(did you mean to add -dump-input=fail here? according to FileCheck docs that's the default)

kadircet marked 3 inline comments as done.Apr 12 2023, 11:43 PM
kadircet added inline comments.

yeah makes sense. i was mostly trying to avoid the extra string operations, with the assumption of "directory" being ~always the same across entries. but we're already paying for those string operations today, so not saving them won't be a regression and being more resilient for such cases in the future wouldn't hurt.


thanks! definitely didn't notice that, I am not sure how valid of a concern this is in practice. as we'll only use the cached result for translation units whose language cannot be determined (I'd claim they all belong to same category). but leaving it as-is in this patch with a fixme (happy to drop the fixme if you think that's actually important). made it similar to existing code pattern, i.e. bailing out early if language is empty.

kadircet updated this revision to Diff 513070.Apr 12 2023, 11:43 PM
kadircet marked 2 inline comments as done.
  • Address comments
nridge accepted this revision.Apr 13 2023, 1:06 AM


Do you have any thoughts on merging D147905 first, with a view to backporting D147905 to the 16.x branch?


nit: typo ("prgoram")

This revision is now accepted and ready to land.Apr 13 2023, 1:06 AM
kadircet updated this revision to Diff 513098.Apr 13 2023, 1:19 AM
kadircet marked an inline comment as done.

Fix typo

Do you have any thoughts on merging D147905 first, with a view to backporting D147905 to the 16.x branch?

Since there are ongoing discussions about D147905, I don't want to hold this patch up -- please go ahead and merge it, and I will rebase that one as needed once we agree on an approach there.