This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Surface diagnostics from headers inside main file
ClosedPublic

Authored by kadircet on Mar 13 2019, 9:09 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

kadircet created this revision.Mar 13 2019, 9:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 13 2019, 9:09 AM

I'll look more closely into the details, but just a high-level question now: why would we want to make this optional and not simply surface these extra diagnostics?

I'll look more closely into the details, but just a high-level question now: why would we want to make this optional and not simply surface these extra diagnostics?

The optional part is rather limiting number of diagnostics that is going to be surfaced. So if you have thousands of errors inside preamble only first 100 of them will appear inside main file by default.
But looking at it again, currently there is no way for user to say "do not limit number of diagnostics" :D They can only change this limit to different values.

kadircet updated this revision to Diff 191265.Mar 19 2019, 3:07 AM
  • Only surface diagnostics with level fatal or error

The optional part is rather limiting number of diagnostics that is going to be surfaced. So if you have thousands of errors inside preamble only first 100 of them will appear inside main file by default.
But looking at it again, currently there is no way for user to say "do not limit number of diagnostics" :D They can only change this limit to different values.

Let's pick a number and hard-code it. We'll get consistent behavior in all clients of clangd, I can hardly think of any existing client that would want to surface everything to the user.
FWIW, 100 seem too much to my taste, since we're only showing errors. Would 10 be a better limit?

ilya-biryukov added inline comments.Mar 19 2019, 3:24 AM
clangd/ClangdUnit.cpp
256 ↗(On Diff #191265)

We should find a way to point into an exact include directive (source locations have this information).

406 ↗(On Diff #191265)

Could we limit per-include-directive instead?
The limit to avoid creating too verbose error messages, rather than reporting too many messages, right?

438 ↗(On Diff #191265)

We should provide the include stack, similar to how compiler does it.
Otherwise it would be really hard to figure out the exact problem the diagnostic is pointing at.

kadircet updated this revision to Diff 191323.Mar 19 2019, 9:10 AM
kadircet marked 2 inline comments as done.Mar 19 2019, 9:10 AM
  • Show include stack in diagnostic message
  • Point to exact include location
kadircet planned changes to this revision.Mar 19 2019, 10:24 AM
  • Limit per include directive
  • Use hardcoded value for limit
kadircet updated this revision to Diff 191465.Mar 20 2019, 3:08 AM
kadircet marked an inline comment as done.
  • Make limit a hardcoded value rather then a command line argument
  • Limit diags per include directive

Thanks for changes. A few more comments.

clangd/ClangdUnit.cpp
259 ↗(On Diff #191465)

Could we binary-search in a sorted vector<Inclusion> with a custom comparator instead?

411 ↗(On Diff #191465)

The function name suggests it does not have side-effects.
Maybe handle the update and query of NumberOfDiagstAtLine outside this function instead?

clangd/Diagnostics.cpp
396 ↗(On Diff #191465)

NIT: could we reuse the function from clang that does the same thing?

The code here is pretty simple, though, so writing it here is fine. Just wanted to make sure we considered this option and found it's easier to redo this work ourselves.

419 ↗(On Diff #191465)

We can still drop diagnostics at some point, right?
Could we move the log statement in a different point?

clangd/Diagnostics.h
87 ↗(On Diff #191465)

Could we avoid changing this interface?
We have enough information to fill the corresponding range and message when collecting diagnostics inside our implementation of clang's DiagnosticConsumer, there's really no point in carrying this information in this struct.

If that means intermediate structures stored in our consumer, so be it, it's a private implementation detail. Diag is a public interface, so keeping it simpler is valuable.

unittests/clangd/DiagnosticsTests.cpp
773 ↗(On Diff #191465)

NIT: maybe use raw string literals to keep the message a one-liner?

779 ↗(On Diff #191465)

Could we add a test for reaching the limit?
Could we mention that some diagnostics were dropped in that case? Something like:

In file included from a.h:20:1:
error: unresolved identifier 'foo'.

And 10 more diagnostics...
kadircet updated this revision to Diff 191831.Mar 22 2019, 2:16 AM
kadircet marked 8 inline comments as done.
  • Address comments
clangd/Diagnostics.cpp
396 ↗(On Diff #191465)

there is TextDiagnostic which prints the desired output expect the fact that it also prints the first line saying the header was included in main file. Therefore, I didn't make use of it since we decided that was not very useful for our use case. And it requires inheriting from DiagnosticRenderer which was requiring too much boiler plate code just to get this functionality so instead I've chosen doing it like this.

Can fallback to TextDiagnostic if you believe showing Included in mainfile.cc:x:y: at the beginning of the diagnostic won't be annoying.

unittests/clangd/DiagnosticsTests.cpp
779 ↗(On Diff #191465)

There is already one, see LimitDiagsOutsideMainFile

nridge added a subscriber: nridge.Mar 24 2019, 6:40 PM
ilya-biryukov added inline comments.Apr 7 2019, 11:23 PM
clangd/ClangdUnit.cpp
405 ↗(On Diff #191831)

The name suggests we filter all diagnostics based on this helper.
Maybe add rename to something more specific?

E.g. IsErrorOrFatal or IsImporantHeaderDiagnostic...

408 ↗(On Diff #191831)

NIT: simplify to

return D.Severity == DiagnosticsEngine::Level::Error
    || D.Severity == DiagnosticsEngine::Level::Fatal
414 ↗(On Diff #191831)

NIT: rename to AddDiagnosticFromInclude or something similar, but more specific than the current name

425 ↗(On Diff #191831)

Why not merge the ShouldSurfaceDiag and TryAddDiag into a single function and handle all the complexities there?
This would simplify the client code, it would be as simple as

else
 AddDiagnosticFromInclude(D)
434 ↗(On Diff #191831)

Can we do this in StoreDiags::flushLastDiag?
All code handling the diagnostics lives there and it has all the information necessary to map to the included line.

445 ↗(On Diff #191831)

This extra complexity does not seem to be worth it after all.
I'd suggest to remove this (at least from the first version of the file), even though I was the one who proposed it.

Still think it's valuable, but the patch is complicated enough as it is, keeping it simpler seems to be more important at this point.

clangd/Diagnostics.cpp
371 ↗(On Diff #191831)

That's a lot of code, could we extract this into a separate function?

372 ↗(On Diff #191831)

Why use getPresumedLoc? Making clangd sensitive to pp directives that rename file or change line numbers does not make any sense.

Could you also add tests for that?

396 ↗(On Diff #191465)

LG, it's not too hard to reconstruct the same output.
Note that D60267 starts outputting notes in a structured way, using the RelatedInformation struct from the LSP.

We should eventually add the include stack into related information too. With that in mind, we should probably add the include stack as a new field to the struct we use for diagnostics.

kadircet updated this revision to Diff 194528.Apr 10 2019, 8:50 AM
kadircet marked 10 inline comments as done.
  • Address comments
clangd/Diagnostics.cpp
372 ↗(On Diff #191831)

It was the way DiagnosticRenderer emitted include stacks, I had just copied it over.
Changing with SourceManger::getIncludeLoc

396 ↗(On Diff #191465)

SG, delaying serialization to LSP layer then.

Sorry, lost this revision. Looking now

ilya-biryukov added inline comments.Apr 16 2019, 2:00 AM
clangd/ClangdUnit.cpp
251 ↗(On Diff #194528)

NIT: use llvm::lower_bound

252 ↗(On Diff #194528)

NIT: lower_bound also works when you work on a different type, no need for the temporary variable.

lower_bound(MainFileIncludes…, [](const Inclusion &L, const Position &Pos) {
  return L.start.line < Pos.line;
})

(the order of lamda parameters might need to be reversed, I always forget what's the correct one)

255 ↗(On Diff #194528)

NIT: add a sanity check that the include was actually there: assert(Res->R == Pos)

clangd/Diagnostics.cpp
55 ↗(On Diff #194528)

reduce nesting by inverting the condition:

if (IncludeStack.empty())
  return;
// rest of the code
205 ↗(On Diff #194528)

Could we get the message first and the include stack later?
The first line is often showed in various lists and having In file included from there is not very helpful.

463 ↗(On Diff #194528)

Could you inline ShouldAddDiag into its single callsite?

clangd/Diagnostics.h
87 ↗(On Diff #194528)

Could you store pair</*Filename*/string, Position> instead?
Would make it simpler to adapt D60267 to this change.

131 ↗(On Diff #194528)

NIT: use tripple slash comments

132 ↗(On Diff #194528)

NIT: the name makes me believe this does a different thing. Maybe mention "include" in the name to make it less confusing?
Something like DiagsInsideIncludes would work.

Another NIT: we don't need a map anymore: DenseSet<int> should be enough

unittests/clangd/DiagnosticsTests.cpp
54 ↗(On Diff #194528)

NIT: maybe simplify to std::equal(IncludeStack.begin(), IncludeStack.end(), arg.IncludeStack.begin())

620 ↗(On Diff #194528)

We were discussing adding the extra files to TestTU instead.
Any reason this did not work out?

kadircet updated this revision to Diff 195375.Apr 16 2019, 7:16 AM
kadircet marked 14 inline comments as done.
  • Address comments
unittests/clangd/DiagnosticsTests.cpp
620 ↗(On Diff #194528)

Yes, my LRU cache evicted the discussion :D

ilya-biryukov added inline comments.Apr 25 2019, 5:57 AM
clangd/Diagnostics.cpp
78 ↗(On Diff #195375)

Could we avoid introducing a function that breaks the invariant of a type?
Having a function to print Position differently would be a much better fit.

225 ↗(On Diff #195375)

WDYT about changing this to a shorter form?

include stack:

./project/foo.h:312
/usr/include/vector:344

Note that it does not mention column numbers as they aren't useful and is shorter.
Since we're already not 100% aligned with clang, why not have a more concise representation?

unittests/clangd/DiagnosticsTests.cpp
739 ↗(On Diff #195375)

So the last location in the include stack is actually the original location of an error.
This is non-obvious design. Maybe move this into a separate field?

unittests/clangd/TestTU.h
51 ↗(On Diff #195375)

Maybe use relative paths?
This would be consistent with Filename and HeaderFilename

52 ↗(On Diff #195375)

Why not StringMap</*Content*/std::string>?
This would be consistent with buildTestFS and disallow adding duplicates.

kadircet updated this revision to Diff 196822.Apr 26 2019, 2:49 AM
kadircet marked 5 inline comments as done.
  • Address comments
  • Drop include stack
kadircet added inline comments.Apr 26 2019, 2:50 AM
clangd/Diagnostics.cpp
78 ↗(On Diff #195375)

No longer needed reverting.

225 ↗(On Diff #195375)

As discussed offline getting rid of include stack in this version of the patch.

ilya-biryukov added inline comments.Apr 29 2019, 12:54 AM
clangd/Diagnostics.cpp
115 ↗(On Diff #196822)

replace vector<SourceLocation> with SourceLocation now that we only need one of the locations.

unittests/clangd/DiagnosticsTests.cpp
748 ↗(On Diff #196822)

NIT: maybe quote the name of the header for better readability: error in include 'c.h' ...?

This looks nice and minimal, thanks!
Happy to LGTM this, unless @sammccall want to take another look (e.g. to account for related information patch)

kadircet updated this revision to Diff 197068.Apr 29 2019, 1:54 AM
kadircet marked 2 inline comments as done.
  • Address comments
kadircet marked 2 inline comments as done.Apr 29 2019, 1:54 AM
sammccall accepted this revision.Apr 29 2019, 2:03 AM

Nice! Just readability/wording nits

clangd/Diagnostics.cpp
137 ↗(On Diff #196822)

Clang tends to use a phrase ending in "here" for such notes.
Suggest "error occurred here"

137 ↗(On Diff #196822)

message should not be capitalized (capitalization will be added later if needed)

141 ↗(On Diff #196822)

Include is not a noun. "included file"?

141 ↗(On Diff #196822)

"Error" is the severity, which appears elsewhere in LSP. (Other messages do not include it)

141 ↗(On Diff #196822)

message should not be capitalized

142 ↗(On Diff #196822)

why are we including the filename here?

it obscures the error message, and will be available via the note.

463 ↗(On Diff #194528)

Does this improve readability over (inline) D.Severity >= DiagnosticsEngine::Error?

clangd/Diagnostics.h
141 ↗(On Diff #196822)

nit: "containing an error" (and variable name)

IncludeLinesWithErrors might be a clearer name

This revision is now accepted and ready to land.Apr 29 2019, 2:03 AM
kadircet updated this revision to Diff 197073.Apr 29 2019, 2:31 AM
kadircet marked 8 inline comments as done.
  • Address comments
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2019, 3:23 AM