This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Use URIs in index symbols.
ClosedPublic

Authored by ioeric on Feb 5 2018, 9:28 AM.

Event Timeline

ioeric created this revision.Feb 5 2018, 9:28 AM

This looks OK so far, where is it going? It doesn't make sense to put URIs in if we only ever use file:.
I guess others will be produced by some kind of extension point on SymbolCollector. That will specify URI schemes we should try, allow you to replace the whole toFileURI, or something else?

Unfortunately there's a bunch of Uris here, where the existing code uses URI...

clangd/index/Index.h
27

nit: FileURI?
(The other style is OK too, though I personally find it harder to read. But the class is URI and we should be consistent)

clangd/index/SymbolCollector.cpp
28

this combines

or better, "resolves relative paths against \p FallbackDir"

33–34

also URI here, and below

204

while here, would you mind changing GetSymbolLocation -> getSymbolLocation?

clangd/index/SymbolYAML.cpp
51

more URI

ioeric updated this revision to Diff 132958.Feb 6 2018, 2:13 AM
ioeric marked 2 inline comments as done.
  • Make URIScheme customizable in SymbolCollector.
ioeric added inline comments.Feb 6 2018, 2:13 AM
clangd/index/Index.h
27

I tend to use Uri for URI strings and URI/U for URI objects. But I'd be okay with either, if you prefer URI.

ioeric added a comment.Feb 6 2018, 2:17 AM

I was thinking about leaving URI scheme customization to the postprocessing phase, but you are right, it would be better to make the URI scheme extendable in SymbolCollector.

Great, this all makes sense. I think we can/should make the scheme selection a bit more robust (we shouldn't crash if we get unexpected filenames).
And... Uri or URI (I really think this is a usability issue - i had a scarring experience with a codebase that couldn't decide how to spell abbreviations...)

clangd/index/Index.h
27

I'm fine with either spelling, but not really with mixing the two.

Is there some precedent for using different capitalization depending on the variable type? That seems... non obvious to me.

clangd/index/SymbolCollector.cpp
67

this doesn't seem unreachable?

clangd/index/SymbolCollector.h
38

We should document the semantics/exceptions in case of failure.
Do we drop the symbol, or drop the location, or fall back to file URI?

One scheme that would be flexible and I think pretty simple:

  • make this a vector of schemes that will be tried in order
  • drop the location (but not symbol) if it's not possible to encode the path
  • note that the vector should probably end in "file" as a fallback, and make {"file"} the default

For our monorepo we can set this to {"google", "file"} or just {"google"} if we never want to leak local paths. It seems pretty plausible that other orgs whose builds aren't totally hermetic would want several schemes here.

unittests/clangd/SymbolCollectorTests.cpp
49

while renaming this, maybe Decl is better? we'll have Def soon!

195

Can you add a test for the case when the path fails to convert to the URI scheme?

hokein added inline comments.Feb 6 2018, 2:46 AM
clangd/index/Index.h
27

I'm +1 on URI, using all capital letters for initialisms can make the name a bit easier to read. Let's be consistent using URI?

ioeric added inline comments.Feb 6 2018, 2:52 AM
clangd/index/Index.h
27

I'll change to URI. Another reason why I avoided using URI as variable names was the name conflicts with the actual URI class... I always need to come up with a prefix...

ioeric updated this revision to Diff 133000.Feb 6 2018, 7:32 AM
ioeric marked 4 inline comments as done.Feb 6 2018, 7:32 AM
sammccall accepted this revision.Feb 6 2018, 7:57 AM
sammccall added inline comments.
unittests/clangd/SymbolCollectorTests.cpp
259–266

leftover logs

This revision is now accepted and ready to land.Feb 6 2018, 7:57 AM
ioeric updated this revision to Diff 133009.Feb 6 2018, 8:09 AM
This revision was automatically updated to reflect the committed changes.