[clangd] Assert path is absolute when assigning to URIForFile.
ClosedPublic

Authored by ilya-biryukov on Tue, Feb 13, 9:57 AM.

Details

Summary

The assertion will point directly to misbehaving code, so that
debugging related problems (like the one fixed by r325029) is easier.

Diff Detail

ilya-biryukov created this revision.Tue, Feb 13, 9:57 AM

I think another option to prevent the bug in r325029 from happening would be providing a canonical approach for retrieving (absolute) paths from FileManager. We already have code in symbol collector that does this, and we have similar code in XRefs.cpp now. We should probably move them to clangd/Path.h and share the code?

sammccall added inline comments.Tue, Feb 13, 11:44 AM
clangd/Protocol.h
51

I don't like how the API changes here take us further away from the other structs in this file.
Why does this one enforce invariants, when the others don't?

If we do want to make this more "safe", I'd suggest making it look more like a URI string (since that's what its protocol role is), like:

class LSPURI {
  LSPURI(StringRef URI) { assert if bad }
  static LSPURI ForFile(StringRef AbsPath) { assert if bad }
  operator string() { return URI::createFile(File).toString(); }
  StringRef file() { return AbsFile; }
  operator bool() { return !File.empty(); } 
  // move/copy constructor/assignment defaulted

 private:
  string AbsFile;
}

But the approach Eric suggested does seem promising: more consistent with how we handle invariants in protocol structs.

  • Remove URIForFile::setFile(), rely on copy and move constructors instead.

I think another option to prevent the bug in r325029 from happening would be providing a canonical approach for retrieving (absolute) paths from FileManager. We already have code in symbol collector that does this, and we have similar code in XRefs.cpp now. We should probably move them to clangd/Path.h and share the code?

I agree with you, we should do this.
But this change is not a mitigation for r325029 specifically. URIForFile would still be tremendously easy to misuse in other cases as well.

clangd/Protocol.h
51

I've updated URIForFile to follow the interface you proposed. (Instead of operator string() we have uri(), but everything else is there). Haven't changed the name yet, happy to do it to.

Why does this one enforce invariants, when the others don't?

It seems useful to catch misbehaving code as early as possible. What are the other invariants that could be useful?

But the approach Eric suggested does seem promising: more consistent with how we handle invariants in protocol structs.

I agree with Eric's approach, I wasn't trying to fix that specific bug with this commit.

I'm trying to argue that we should check for non-absolute paths as early as possible and fail with assertions to make writing code easier in the future. Do you agree that would be useful or make we shouldn't do it?

ioeric added inline comments.Wed, Feb 14, 4:43 AM
clangd/Protocol.h
51

I agree that this change would help debugging, so I don't really have objection. I'll leave approval to Sam though, as he had some suggestions.

I wonder if we want to make the assertion more aggressive. Currently, this only helps with assertion on. Would it make sense to also check this in production code and provide useful error message so that it would be easier for us to spot the problem when users report a bug?

56

nit: I like file() that Sam suggested a bit more.

sammccall accepted this revision.Thu, Feb 15, 10:14 AM

LG if we want to do this
(please getFile -> file though!)

clangd/Protocol.h
28

can we hold off on adding Path to more places in this patch?
I'd like to see if others find it useful - it mostly seems obscure to me.

51

I'm not going to block this change, but I somewhat prefer the current code without it (and assertions in the places that populate URIForFile instead).

It seems useful to catch misbehaving code as early as possible. What are the other invariants that could be useful?

At a quick scan of protocol.h from the top:

Position.line/character >= 0
Range.start <= Range.end
Location.uri non-empty
TextDocumentItem.uri non empty (and lots of other uris)
FormattingOptions.tabSize >= 0
Diagnostic.message non-empty
ExecuteCommandParams.command has a valid value
...

This revision is now accepted and ready to land.Thu, Feb 15, 10:14 AM
ilya-biryukov marked an inline comment as done.Fri, Feb 16, 4:20 AM
ilya-biryukov added inline comments.
clangd/Protocol.h
28

Done.

I was the one who introduced them and I (obviously) like them. I find the code using them a bit easier to read, they don't add any type-safety, though.
Let's discuss whether others find them useful. If I'm the only one who feels this way, we can remove them.

51

After looking at the code, all instances of URIForFile are created with the assertion enforced, other structs you mention can be created in arbitrary state by fromJSON functions.
I'm not sure if making fromJSON functions more complicated are a good trade-off for enforcing those assertions.

URIForFile is different from other structs, it already has logic to convert file paths to uri (and back indirectly, via fromJSON function). Others merely directly capture the information from the LSP spec.

56

Done.

Note that this technically breaks LLVM Style Guide: "Function names should be verb phrases".
But I think we're breaking it in multiple places already (including here, e.g. uri()).

ilya-biryukov marked an inline comment as done.
  • Rename getFile() to file()
  • Removed the Path.h include
  • Rebased onto head
This revision was automatically updated to reflect the committed changes.