Page MenuHomePhabricator

[clangd] Add path mappings functionality

Authored by wwagner19 on Jul 7 2019, 8:18 PM.



Add path mappings to clangd which translate file URIs on inbound and outbound LSP messages. This mapping allows clangd to run in a remote environment (e.g. docker), where the source files and dependencies may be at different locations than the host. See for more.

Diff Detail

Event Timeline

wwagner19 created this revision.Jul 7 2019, 8:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2019, 8:18 PM
wwagner19 marked 2 inline comments as done.Jul 7 2019, 8:32 PM


This is my first proposed change to LLVM, so sorry if I messed anything up. The proposed changes here follow from discussion on clangd-dev (from janruary and from june ).
It seems like a rather large one, but fear not, most of the code is simply tests and wrapper code.

Happy to hear any feedback, thanks!


Ideally this wouldn't be in the public interface, but I wanted to unit test it and wasn't sure of a way to do that cleanly - other than putting it in the header.


Comma separated list here obviously limits the path mapping file paths, but there was precedent for this already (in --QueryDriverGlobs) and it seemed simplest.

Also,a command-line argument felt the most straightforward, as I'm not aware of any clangd project settings file (please lmk if there is one :) ). Users can set up custom path mappings by using e.g. vscode workspace settings.json, coc.nvim coc-settings.json

Thanks for putting this together. Overall it looks pretty good!

Main issues to address are:

  • file path handling should manage windows paths and have tests for this
    • the lit test can be simplified quite a lot I think

V.kind() is an enum, using const auto& here is confusing. It's just Kind?


object keys may be file uris too. (see WorkspaceEdit.changes)

This case is going to be a bit annoying to handle, I think :-(


not a really big deal, but we're forcing a copy here - should doPathMapping mutate its argument and return void instead?


I'm not sure why this needs to be an error


will an absolute windows path C:\foo be treated as absolute on a unix machine?
(I know the reverse is true)


Please leave this to the caller to log, if necessary.

(I think we're likely to start logging argv/config on startup, so in general having every flag logged separately probably isn't desirable)


again, copying every string in the json seems excessive. can we return optional<string> and only populate it when we match?


auto here is just string, I think?


This is simplistic I'm afraid: it's not going to work at all on windows (where paths have \ but URIs have /), and it's going to falsely match "/foo-bar/baz" against a mapping for "/foo".

llvm::sys::fs::replace_path_prefix is probably the beginning of a solution. If I'm reading correctly the first two args need to have the same slash style and OldPrefix should have its trailing slash.

I'd actually suggest pulling out the function to map one string, and unit-testing that, to catch all the filename cases.

Then the json::Value traversal tests should be focused on testing the places where a string can appear in JSON, not all the different contents of the string.


This is too verbose for vlog. If you think it's important to keep, it should be dlog.


hmm, the host/remote terminology is a bit confusing to me.
I guess the host is the machine where clangd is running, and remote is the other end, but from the user's point of view this is a configuration where the clangd is remote.

What about calling these client/server paths? The language client/server roles are defined in the spec and these terms are likely to make some sense to users.


Because there's room for confusion, I'd prefer a struct with named fields over pair<string, string>


I'd suggest = as a delimiter instead, it's more evocative and more common.

The order is tricky, I suspect /client/path=/server/path is going to be more intuitive


This seems fine. Alternatively you could implement a dummy transport and wrap it, but it's probably too much work to justify

26 ↗(On Diff #208306)

This test seems to have unneccesary moving parts. Was it copied from the background index test?

One header file on disk and one draft file sent over RPC should be enough. A compilation database shouldn't be necessary I think.

You shouldn't need to splice URIs, because the input and output paths are virtual and fully under your control (that's the point of this patch, :)). So the test should be able to run on windows.

I think this test can actually be a standard one where the JSON-RPC calls and assertions go in the *.test file.

47 ↗(On Diff #208306)

This is unfortunately not enough context to be a precise assertion here - that URI could appear in the output for other reasons. See the other *.test for how to make a tighter assertion on the output (Mostly matching the JSONRPC id and then CHECK-NEXT).

And the assertion shouldn't need a wildcard in it, the client path is under your control.

47 ↗(On Diff #208306)

it'd be good to check some of the other output e.g. filename from the diagnostics notification.


the client side of the mapping doesn't need to exist at all on clangd's side, so don't put %t in it.

I'd suggest making the client a windows path, especially since we're mostly developing on unix. This will cover more of the interesting cases.


This seems fine, though it is a bit odd to do half the parsing (split on comma) here, and half elsewhere. I'd suggest just making this a cl::opt<string> and passing the whole string to the parse function.


Windows paths need to be specifiable as `C:\foo\bar".

We should have some tests for each combination of client/server style, I think. (Though if we need an #ifdef because clangd only handles "native" server paths, that's fine).

wwagner19 marked 9 inline comments as done.Jul 8 2019, 8:36 PM

Thanks for all the feedback, Sam! I'll try and get an updated patch up sometime tomorrow.


Indeed, it seems so. It doesn't look like json::Object has any key removal functionality (although I could very well be reading this wrong). If so, then I guess I can just create a new json::Object, moving over the old values, and replacing the keys if necessary.


yup makes sense!


Ah yea good catch, this will be a bit more tricky then. I was originally just imagining the users using strictly URI syntax in the --path-mappings, but that's doesn't seem very friendly in hindsight. So just to clarify, we should:

  1. Allow the users to specify windows style paths (e.g. C:\foo) and posix style paths
  2. Allow the inter-op of both, i.e. --path-mappings="C:\foo=/bar"

IIUC, file URIs will always have the forward-slash syntax, so this may require storing the windows-style path mapping in forward-slash style. I can try and get this going tomorrow. Although, one tricky thing might be trying to figure out if a path is indeed windows-style (in a unix environment where _WIN32 isn't defined).


Agreed, sounds better


Way better :)

26 ↗(On Diff #208306)

This was copied from the background test, I felt a bit uneasy about how complicated it got, but I had a bit of trouble getting a simpler one going. You're right though, I can't see why this wouldn't work with a "remote" file on disk, and having the draft RPC file simply include that, i'll have another go at it.

wwagner19 updated this revision to Diff 227643.Nov 3 2019, 7:19 PM

Unfortunately, I had to take a bit of a hiatus there, but i'm back a few months later with an updated patch incorporating all of @sammccall 's feedback! Notably,

  • Windows paths are now accounted for, basically we first try to parse a unix path, and fall back to windows if possible. After, windows paths are converted to forward-slash notation, so the prefix stuff can work.
  • Mapping LSP jsonrpc keys is now done, albeit a bit awkward due to no delete key/value API
  • The lit test is improved, as it no longer relies on background index and tests windows client path

As for the validity of this feature, I am well aware of vscode's remote feature, but it is still essential for vim/emacs/etc. editors, IMO.

Please take a look when you have a chance, thanks.

wwagner19 marked 13 inline comments as done.Nov 4 2019, 5:07 AM

Thanks, this looks a lot better.
Main thing that was unclear to me is the fact that the paths in the mappings are now URI-paths, so "/C:/foo" on windows.

This looks like the right idea as it ensures much of the path-mapping code gets to ignore slashes and such.
Docs/naming could reflect it a bit better.


Sorry, you're right - I'll fix json::Object.
Nevertheless I think the copy-into-new-object approach is the clearest way to deal with renames of keys that may otherwise collide in the intermediate state.


by value, not by ref


this isn't a doxygen comment, please omit \p


"posix-style" doesn't really describe representing c:\foo as /c:/foo. That's really *just* a URI thing AFAIK.

Something like "Converts a unix/windows path to the path part of a file URI".
But in that case, I think the implementation is just URI::createFile(Path).body(). Does that pass tests?


nit: add a comment like "bail out quickly in the common case"? to make it clear this is (only) a performance optimization


you need to consume the error, or it'll assert



Sorry, I know I suggested replace_path_prefix, but now that the mappings consist of paths in their URL form, I think the old string::replace version is what you want :-/


add a comment: Therefore, both paths are represented as in file URI bodies, e.g. /etc/passwd or /C:/config.sys


nit: you can probably drop these constructors and use {} aggregate initialization, up to you.


nit: please \p RawPathMappings with a space, or drop the doxygen tag entirely. These are read more often in the code than in rendered documentation, and \pFoo is hard to read.


nit: the sense of the bool is pretty arbitrary here, prefer a two value enum?

e.g. enum class PathMapping::Direction { ServerToClient, ClientToServer }

(Reusing the "server" and "client" concept rather than adding "incoming/outgoing" seems a little simpler, though up to you)


nit: please write the mapping as C:\ - it's probably redundant with the tests above, but it's clearer that it's on purpose


The server path here isn't unix-style - that would be /foo/ or maybe C:/foo (windows path with unix-style slashes).

You could call this URI style, but I don't think this is something we'd want to (deliberately) support and we shouldn't have tests for it.

(also welcome back and thanks for picking this up!)

wwagner19 updated this revision to Diff 229721.Nov 17 2019, 8:42 AM
wwagner19 marked 5 inline comments as done.

Thanks for the second review Sam, I addressed most of your comments, notably:

  • Changed the bool IsIncoming to an enum
  • Fixed the "doxygen" comments,
  • Removed some redundant incudes/variables
  • Switched replace_path_prefix to string replace
wwagner19 marked 9 inline comments as done and an inline comment as not done.Nov 17 2019, 8:51 AM
wwagner19 added inline comments.

Oh I did not realize createFile was a thing, however looking at the way it's implemented now, won't that throw an assert if a non-absolute path is passed in? If so, is that desirable at all?

IIUC, if I were to use that API, then wouldn't it make more sense for it to return an llvm::Expected? If we want to consolidate the logic to one place, I'd be happy to try and refactor the signature.


Will do, I like string replace better anyway!

I'm not a huge fan of the mappingMatches function, and would prefer a simple string startswith(from), but the only way I may see that working is by appending "/" to all the path mappings internally - which would prevent /foo matching /foo-bar - but appending a "/" would break directory-based file URIs, I believe.


much much better that way, thanks

sammccall accepted this revision.Nov 18 2019, 12:34 AM

LG. A couple of optional comments to address as you see fit.

It looks like you're not an LLVM committer yet, do you want me to commit this for you?
(Usually it's reasonable to ask for this after landing a couple of patches:


I think allowing relative paths to be specified in path mappings is needlessly confusing. Clangd flags are typically specified in a config file somewhere, and that config often doesn't know about the working directory.

We don't want to hit the assert, so I'd suggest testing if it's absolute first, and returning an error if not.


Yeah, I see what you mean.
My favorite way to structure this code would probably be to always strip trailing slashes from the mappings, and combine the check-for-match and apply-match for a single entry:

optional<std::string> doPathMapping(StringRef Path, StringRef From, StringRef To) {
  if (Path.consume_front(From) && (Path.empty() || Path.startswith('/'))
    return (To + Path).str()
  return None;

Totally up to you, your tests look good :-)

This revision is now accepted and ready to land.Nov 18 2019, 12:34 AM
wwagner19 updated this revision to Diff 229969.Nov 18 2019, 9:01 PM
wwagner19 marked 2 inline comments as done.

Awesome! I am not an LLVM committer, so if you could commit on my behalf that'd be great- although I'm not sure how LLVM handles squashing/merging, continuous integration, etc., so please let me know if I need to do anything else (aside from the code of course).

Once again, thanks for all the help - I learned a lot!

wwagner19 marked 2 inline comments as done.Nov 18 2019, 9:09 PM
wwagner19 added inline comments.

So even with checking for absolute paths before calling createFile, it still won't work quite right. Currently, createFile, and consequently FileSystemScheme().uriFromAbsolutePath(AbsolutePath), converts paths differently depending on the environment Clangd is running on (via WIN32 or some other means).

e.g. if we had mapping C:\home=/workarea and Clangd built/running on linux, then the replace_path_prefix by default would use posix style, which won't replace the \. This may not be too useful in practice, but it's a small price to pay for windows-unix-interop, I feel.

Hey @sammccall, any update here?

Sorry, I was out for a bit and this fell off my radar.
It looks good, but the last snapshot seems to have reverted some earlier changes (book vs enum, some doxygen comments etc). Do you know what's up there?

Either way I should be able to apply this on Monday

wwagner19 updated this revision to Diff 236293.Jan 5 2020, 9:37 PM

The last diff was broken, this most recent one

  • Changes IsIncoming boolean to an enum
  • Refactors the matching path logic
wwagner19 updated this revision to Diff 236295.Jan 5 2020, 9:48 PM

To be honest, I'm not sure how to remedy this. So I just rebased all my commits into one and dropped the git show HEAD -U999999 into here.

Please let me know if you need me to fix anything / open a new diff.

Thanks! The latest snapshot looks good. Landing it now with a few minor tweaks mentioned below.

(And trivial local style things, we generally prefer to drop braces on simple if statements etc.)


Removed the MapperFunc argument here as it's always doPathMapping.

Then this is just applyPathMappings, so merged the two.


changed to client=server to match the flag syntax


I extended this doc a bit to clarify what "client" and "server" paths mean, and explain the first-match-wins.
(I don't think the example reflects first-match-wins, so I reversed the order)


changed to elog("{0}", Mappings.takeError());

This revision was automatically updated to reflect the committed changes.