This is an archive of the discontinued LLVM Phabricator instance.

Add a prototype for clangd v0.1
ClosedPublic

Authored by bkramer on Feb 2 2017, 6:33 AM.

Details

Summary

clangd is a language server protocol implementation based on clang. It's
supposed to provide editor integration while not suffering from the
confined ABI of libclang.

This implementation is limited to the bare minimum functionality of
doing (whole-document) formatting and rangeFormatting. The JSON parsing
is based on LLVM's YAMLParser but yet most of the code of clangd is
currently dealing with JSON serialization and deserialization.

This was only tested with VS Code so far, mileage with other LSP clients
may vary.

Diff Detail

Repository
rL LLVM

Event Timeline

bkramer created this revision.Feb 2 2017, 6:33 AM
krasimir accepted this revision.Feb 2 2017, 6:41 AM
krasimir added a subscriber: krasimir.

looks good!

This revision is now accepted and ready to land.Feb 2 2017, 6:41 AM
djasper accepted this revision.Feb 3 2017, 3:26 AM
djasper added a subscriber: djasper.

Just a few nits. I think this looks like a great starting point!

clangd/ClangDMain.cpp
33 ↗(On Diff #86800)

Maybe add "Implement" :)

52 ↗(On Diff #86800)

Nit: string also has "empty()", right? Maybe only convert to StringRef after this check.

62 ↗(On Diff #86800)

Should you somehow actually read them and assert that they are "\r\n"?

74 ↗(On Diff #86800)

So you currently always print this to std::err.. Should you guard this in DEBUG or something?

clangd/DocumentStore.h
25 ↗(On Diff #86800)

I'd call this addDocument for symmetry with removeDocument. The fact that you phrase the comment that way, also make me think that this will be the more intuitive name.

clangd/JSONRPCDispatcher.cpp
89 ↗(On Diff #86800)

I'd pull out this case and move it to the front to show that this is a guaranteed early exit and the function can never fall through to the following code. But I am not sure.

clangd/ProtocolHandlers.h
27 ↗(On Diff #86800)

In the long run, we probably don't want to specify all of these in the same file, but for now this seems fine.

test/clangd/formatting.txt
9 ↗(On Diff #86800)

How did you come up with this indentation?

Adi added a subscriber: Adi.Feb 3 2017, 3:37 AM

I had some spare time ... I hope you don't mind the comments. Hopefully you will find something useful :)

clangd/ClangDMain.cpp
65–70 ↗(On Diff #86800)

Avoid unnecessary JSON.resize(Len) & potential reallocation during JSON.push_back('\0') by allocating enough space in advance: std::vector<char> JSON(Len + 1);

73–77 ↗(On Diff #86800)

Since llvm::StringRef does not own the data, I think it would make more sense to do the logging after Dispatcher.call().

80 ↗(On Diff #86800)

You are building llvm::StringRef without the trailing null byte here, yet comment above states that YAML parser requires a null byte.

80 ↗(On Diff #86800)

Perhaps make a log entry if Dispatcher.call() failed.

clangd/JSONRPCDispatcher.cpp
32–40 ↗(On Diff #86800)

I would extract this implementation to the separate handler class (e.g. OperationNotSupportedHandler) and make this an interface class. It would make code and intent more explicit.

82–83 ↗(On Diff #86800)

dyn_cast might fail and leave you with Version set to nullptr. Using static_cast is better approach if you are sure the cast will be successful.

112–120 ↗(On Diff #86800)

Perhaps refactor this code and the one above into the separate method to gain some more readability.

clangd/JSONRPCDispatcher.h
25–32 ↗(On Diff #86800)

To avoid virtual dispatch one may rewrite this class in terms of CRTP. E.g.

template <typename T>
class Handler {
public:
    Handler(llvm::raw_ostream &Outs, llvm::raw_ostream &Logs)
      : Outs(Outs), Logs(Logs) {}
    virtual ~Handler() = default;

    void handleMethod(const llvm::yaml::MappingMode *Params, StringRef ID) { 
        static_cast<T*>(this)->handleMethod(Params, ID); 
    }
    void handleNotification(const llvm::yaml::MappingMode *Params) { 
        static_cast<T*>(this)->handleNotification(Params); 
    }

protected:
    llvm::raw_ostream &Outs;
    llvm::raw_ostream &Logs;

    void writeMessage(const Twine &Message);    
};

And then use it as:

struct MyConcreteHandler : public Handler<MyConcreteHandler> {
    MyConcreteHandler(llvm::raw_ostream &Outs, llvm::raw_ostream &Logs /* other params if necessary */) 
      : Handler(Outs, Logs) /* init other members */
    {}
 
    void handleMethod(const llvm::yaml::MappingMode *Params, StringRef ID) { 
        // impl
    }
    void handleNotification(const llvm::yaml::MappingMode *Params) { 
        // impl 
    }
};
29 ↗(On Diff #86800)

const ptr/ref to Params?

32 ↗(On Diff #86800)

const ptr/ref to Params?

clangd/Protocol.cpp
11 ↗(On Diff #86800)

To some degree it could be refactored. I would go for CRTP again here. It would simplify the code a little bit. E.g.

template <typename T>
struct Params {
    static llvm::Optional<T> parse(const llvm::yaml::MappingNode& node) {
        T Result;
        for (auto& NextKeyValue : node) {
            auto *KeyString = dyn_cast<llvm::yaml::ScalarNode>(NextKeyValue.getKey());
            if (!KeyString)
                return llvm::None;
            
            llvm::SmallString<10> KeyStorage;
            StringRef KeyValue = KeyString->getValue(KeyStorage);
            auto *Value = dyn_cast_or_null<llvm::yaml::ScalarNode>(NextKeyValue.getValue());
            if (!Value)
                return llvm::None;

            Result = static_cast<T*>(this)->setParam(KeyValue, *Value); // or make Result an argument if worried about RVO
            if (!Result) // we can make this more neat by placing it inside a for loop condition (ranged-for doesn't allow this)
                return llvm::none;
        }
        return Result;
    }
};

struct TextDocumentIdentifier : public Params<TextDocumentIdentifier> {
    /// The text document's URI.
    std::string uri;

    llvm::Optional<TextDocumentIdentifier> setParam(StringRef KeyValue, const llvm::yaml::ScalarNode& Value) {
        TextDocumentIdentifier Result;
        llvm::SmallString<10> Storage;
        if (KeyValue == "uri") {
            Result.uri = Value.getValue(Storage);
        } else if (KeyValue == "version") {
            // FIXME: parse version, but only for VersionedTextDocumentIdentifiers.
        } else {
            return llvm::None;
        }
        return Result;
    }
};

struct Position : public Params<Position> {
    /// Line position in a document (zero-based).
    int line;

    /// Character offset on a line in a document (zero-based).
    int character;

    llvm::Optional<Position> setParam(StringRef KeyValue, const llvm::yaml::ScalarNode& Value) {
        Position Result;
        llvm::SmallString<10> Storage;
        if (KeyValue == "line") {
            long long Val;
            if (llvm::getAsSignedInteger(Value.getValue(Storage), 0, Val))
                return llvm::None;
            Result.line = Val;
        } else if (KeyValue == "character") {
            long long Val;
            if (llvm::getAsSignedInteger(Value.getValue(Storage), 0, Val))
                return llvm::None;
            Result.character = Val;
        } else {
            return llvm::None;
        }
        return Result;
    }
};

I am not really familiar with all of the dependencies here but I may suggest that one might also try to generalize this approach even further by encapsulating the handling of KeyValue into the common implementation. E.g.

template <typename T>
T fromKeyValueToParam(StringRef KeyValue, const llvm::yaml::ScalarNode& Value) {
    T Result;
    llvm::SmallString<10> Storage;

    if (KeyValue == "line")
        long long Val;
        if (llvm::getAsSignedInteger(Value.getValue(Storage), 0, Val))
            return llvm::None;
        Result.line = Val;
    } else if (KeyValue == "character") {
        long long Val;
        if (llvm::getAsSignedInteger(Value.getValue(Storage), 0, Val))
            return llvm::None;
        Result.character = Val;
    }
    else if (KeyValue == "textDocument") {
        auto Parsed = TextDocumentIdentifier::parse(Value);
        if (!Parsed)
            return llvm::None;
        Result.textDocument = std::move(*Parsed);
    }
    else if (KeyValue == "options") {
        auto Parsed = FormattingOptions::parse(Value);
        if (!Parsed)
            return llvm::None;
        Result.options = std::move(*Parsed);else if (KeyValue == "range")
    }
    else if (KeyValue == "range") {
        auto Parsed = Range::parse(Value);
        if (!Parsed)
            return llvm::None;
        Result.range = std::move(*Parsed);
    }
    else if ( ... ) {
        ...
    }
    
    ...

    return Result;
}

Then you would be able to handle everything in Params<T>::parse() without any other specialized code required to be written in concrete parameters. Code could look something like this:

template <typename T>
struct Params {
    static llvm::Optional<T> parse(const llvm::yaml::MappingNode& Params) {
        T Result;
        for (auto& NextKeyValue : Params) {
            auto *KeyString = dyn_cast<llvm::yaml::ScalarNode>(NextKeyValue.getKey());
            if (!KeyString)
                return llvm::None;
            
            llvm::SmallString<10> KeyStorage;
            StringRef KeyValue = KeyString->getValue(KeyStorage);
            auto *Value = dyn_cast_or_null<llvm::yaml::ScalarNode>(NextKeyValue.getValue());
            if (!Value)
                return llvm::None;

            Result = fromKeyValueToParam(KeyValue, *Value);
            if (!Result) // we can make this more neat by placing it inside a for loop condition (ranged-for doesn't allow us that)
                return llvm::none;
        }
        return Result;
    }
};

struct Position : public Params<Position> {
    /// Line position in a document (zero-based).
    int line;

    /// Character offset on a line in a document (zero-based).
    int character;
};

struct TextDocumentIdentifier : public Params<TextDocumentIdentifier> {
    /// The text document's URI.
    std::string uri;
};
klimek accepted this revision.Feb 3 2017, 3:53 AM

LG. Couple of questions.

clangd/ClangDMain.cpp
65 ↗(On Diff #86800)

Shouldn't that be unsigned char for raw bytes?

clangd/JSONRPCDispatcher.h
29 ↗(On Diff #86800)

Here and below, document what the default implementations do.

29–32 ↗(On Diff #86800)

Can we make those Params pointers-to-const?

klimek added inline comments.Feb 3 2017, 3:55 AM
clangd/JSONRPCDispatcher.h
25–32 ↗(On Diff #86800)

To avoid virtual dispatch one may rewrite this class in terms of CRTP. E.g.

Why would virtual dispatch be a problem here? It seems strictly simpler this way.

Adi added inline comments.Feb 3 2017, 4:35 AM
clangd/JSONRPCDispatcher.h
25–32 ↗(On Diff #86800)

I've mentioned it only as an alternative approach. I was not implying that virtual dispatch is a problem. It's a matter of taste in this case really.

clangd/Protocol.cpp
11 ↗(On Diff #86800)

fromKeyValueToParam() will of course not compile but you've got the idea. I think it can be achieved in this way somehow.

This might be a bad question, but is there any particular reason why you didn't use the YAML Traits API for parsing instead of the raw YAML Stream API? In my experience the traits based API is quite convenient for structured data like various options and parameters in this protocol.

bkramer marked 19 inline comments as done.Feb 6 2017, 7:08 AM

This might be a bad question, but is there any particular reason why you didn't use the YAML Traits API for parsing instead of the raw YAML Stream API? In my experience the traits based API is quite convenient for structured data like various options and parameters in this protocol.

It's an excellent question. I actually tried to make YAMLTraits work for this, but it gets annoying due to the dynamic parts of JSONRPC. You have to first parse the method string and then dispatch based on that string. This is not something that the YAMLTraits interface supports. I think it could be extended to allow mixing code written against YAMLParser with YAMLTraits code and then we can clean up the manual code here. Before that happens I want the YAMLParser implementation working, because there are some complicated union types in the Language Server Protocol that might require additional tweaks to the YAML infrastructure.

clangd/ClangDMain.cpp
52 ↗(On Diff #86800)

But string doesn't have trim(). An "empty" line still has a \n, which I want to discard.

65 ↗(On Diff #86800)

I agree, but that would add pointer punning on every usage, as iostreams and YAML parser want normal chars.

73–77 ↗(On Diff #86800)

But that will confuse the logging. You'd get the query after the answer.

74 ↗(On Diff #86800)

My goal is to wire this up to the editor in some way and make it possible to discard the log with a command line flag (send it to nulls()). I think using DEBUG only makes that more complicated.

That said, the logging parts of clangd probably need a redesign at some point ;)

80 ↗(On Diff #86800)

This is intentional. YAML parser treats the pointer inside of the StringRef as a null terminated string, reading past the end of the StringRef.

clangd/DocumentStore.h
25 ↗(On Diff #86800)

I picked set because it can overwrite contents and that's how it used. Your comment makes sense though so I changed it back to add.

clangd/JSONRPCDispatcher.cpp
32–40 ↗(On Diff #86800)

The goal of the default methods is to make it easy to implement only one of the two methods. If I want to do that with an OperationNotSupportedHandler I'd have to inherit from that, which is more awkward.

82–83 ↗(On Diff #86800)

It's supposed to not crash and return an error. This is user input. Added the missing nullptr check.

89 ↗(On Diff #86800)

Not sure if I understand. It's vital that this part is called when we just read params, otherwise the YAMLParser will crash.

clangd/JSONRPCDispatcher.h
29 ↗(On Diff #86800)

YAML nodes are supposed to be immutable and don't provide const members.

29–32 ↗(On Diff #86800)

No. YAML nodes are supposed to be immutable and don't provide const members.

clangd/Protocol.cpp
11 ↗(On Diff #86800)

I don't think this really helps. The most annoying repetitive part is the walk over the fields, which is still there.

In the long term I want to replace this with a YAMLTraits-like thing, but that will take a while.

test/clangd/formatting.txt
9 ↗(On Diff #86800)

It's hardcoded as a raw string literal in the source. It looks better there ;)

bkramer updated this revision to Diff 87226.Feb 6 2017, 7:08 AM
bkramer marked 8 inline comments as done.

Address review comments. Make test actually run (missing cmake file)

This revision was automatically updated to reflect the committed changes.
krasimir added inline comments.Feb 7 2017, 3:02 AM
test/clangd/formatting.txt
9 ↗(On Diff #86800)

Maybe you could consider clang-format-ing the json output too? :D