Page MenuHomePhabricator

clangd: Tolerate additional headers

Authored by puremourning on Aug 29 2017, 4:08 PM.



The language server protocol specified 2 headers (Content-Length and Content-Type), but does not specify their sequence. It specifies that an empty line ends
headers. Clangd has been updated to handle arbitrary sequences of headers, extracting only the content length.

Diff Detail


Event Timeline

puremourning created this revision.Aug 29 2017, 4:08 PM
ilya-biryukov requested changes to this revision.Aug 30 2017, 5:00 AM
ilya-biryukov added a reviewer: ilya-biryukov.

Thanks for the fix. Looks good, just a few comments.

138 ↗(On Diff #113175)

Maybe we could move that variable into loop body by splitting the loop body into multiple parts?
Having it outside a loop makes it harder to comprehend the parsing code.
Rewriting to something like this would arguably make it easier to read:

while (In.good()) {
  // Read first line
  llvm::StringRef LineRef = /*...*/;
  // Skip comments
  if (LineRef.startsWith("#"))
  // Read HTTP headers here, reading multiple lines right inside the loop.
  unsigned ContentLength = 0;
  // ....

  // Read ContentLength bytes of a message here.
  std::vector<char> JSON;
156 ↗(On Diff #113175)

I believe that FIXME referred to the fact that the header does nothing in clangd.
For example, one important part of Content-Type is charset, we completely ignore that now.
So we should probably leave a FIXME, maybe rephrasing it a bit.

@bkramer, could you confirm I got the meaning of the comment right?

163 ↗(On Diff #113175)

Maybe log an error if Content-Length is specified twice?

167 ↗(On Diff #113175)

Maybe log the skipped header?

202 ↗(On Diff #113175)

Maybe log the ignored line here?

61 ↗(On Diff #113175)

Maybe also check stderr by piping it into stdout and using a separate -check-prefix?
I mean something like:

# RUN: clangd -run-synchronously < %s | FileCheck %s
# RUN: clangd -run-synchronously < %s  2>&1 >/dev/null | FileCheck -check-prefix=ERR %s
# ...
# ERR: JSON dispatch failed!

Thanks for the review! Few quick responses.

138 ↗(On Diff #113175)

I'll take a look, sure. Something like...

while ( the input stream is good )
  set len to 0
  while ( the input stream is good )
    read a line into header
    if the line is a comment, read another line (continue the inner loop !)

    if header is Content-Length, store the length in len
    otherwise, if header is empty, we're no longer reading headers, break out of the inner loop

  if the input stream is not good, break out of the loop

  if len is 0, reset the outer loop

  read len bytes from the stream, 
  if we read len bytes ok, parse JSON and dispatch to the handlers
156 ↗(On Diff #113175)

Ah that's a good point. I had assumed the FIXME was that header parsing wasn't complete.

OTOH, the language server protocol states:

The content part is encoded using the charset provided in the Content-Type field. It defaults to utf-8, which is the only encoding supported right now.

So it's all a little moot :)

205 ↗(On Diff #113175)

typo "consumed"

61 ↗(On Diff #113175)

Ah neat. I did try and work out a way to do that. Will do this, thanks for the tip.

Keep ContentLength within the loop by using an inner loop to read headers.
Update tests to check stderr correctly.
Add some useful diagnostics to the log.

Tidy logging: missing newlines.

Validate that the duplicate message is printed.

So I think this is ready now. Anything more you need from me?

138 ↗(On Diff #113175)


163 ↗(On Diff #113175)

OK, though I suspect this would only happen in the tests :)

167 ↗(On Diff #113175)

I think this would just be noise for conforming clients which send the (legitimate) Content-Type header. We could of course special case this, but I'm not sure the extra logging would be all that useful.

ilya-biryukov accepted this revision.Sep 4 2017, 3:28 AM

Looks good and ready to land. Thanks for this change.
Do you have commit rights to llvm repo?

This revision is now accepted and ready to land.Sep 4 2017, 3:28 AM
ilya-biryukov added inline comments.Sep 4 2017, 3:33 AM
167 ↗(On Diff #113175)

You're right. Logging Content-Type headers doesn't make any sense.

Looks good and ready to land. Thanks for this change.
Do you have commit rights to llvm repo?

Thanks for the review!

No I don't have commit rights. Can you land for me? Thanks!

This revision was automatically updated to reflect the committed changes.

No I don't have commit rights. Can you land for me? Thanks!