This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Surface errors from command-line parsing
ClosedPublic

Authored by ilya-biryukov on Aug 26 2019, 11:02 AM.

Details

Summary

Those errors are exposed at the first character of a file,
for a lack of a better place.

Previously, all errors were stored inside the AST and report
accordingly. However, errors in command-line argument parsing could
result in failure to produce the AST, so we need an alternative ways to
report those errors.

We take the following approach in this patch:

  • buildCompilerInvocation() now requires an explicit DiagnosticConsumer.
  • TUScheduler and TestTU now collect the diagnostics produced when parsing command line arguments. If pasing of the AST failed, diagnostics are reported via a new ParsingCallbacks::onFailedAST method. If parsing of the AST succeeded, any errors produced during command-line parsing are stored alongside the AST inside the ParsedAST instance and reported as previously by calling the ParsingCallbacks::onMainAST method;
  • The client code that uses ClangdServer's DiagnosticConsumer does not need to change, it will receive new diagnostics in the onDiagnosticsReady() callback

Errors produced when parsing command-line arguments are collected using
the same StoreDiags class that is used to collect all other errors. They
are recognized by their location being invalid. IIUC, the location is
invalid as there is no source manager at this point, it is created at a
later stage.

Although technically we might also get diagnostics that mention the
command-line arguments FileID with after the source manager was created
(and they have valid source locations), we choose to not handle those
and they are dropped as not coming from the main file. AFAICT, those
diagnostics should always be notes, therefore it's safe to drop them
without loosing too much information.

Diff Detail

Repository
rL LLVM

Event Timeline

ilya-biryukov created this revision.Aug 26 2019, 11:02 AM
nridge added a subscriber: nridge.Aug 26 2019, 7:18 PM

Thanks! This looks really useful when we can't build an AST due to unknown compiler commands, but I am not sure about how useful it is to surface command-line parsing errors once we are able to build an AST.
Because people most likely won't care about these errors once clangd is working for them, and even get annoyed since most of the developers has little control over how their build system generates their compile
commands, therefore it will be really hard to get rid of those errors showing up in all the files they want to edit.

So I suggest only surfacing these if we are not able to create a compiler invocation or build an ast at all. WDYT?

Thanks! This looks really useful when we can't build an AST due to unknown compiler commands, but I am not sure about how useful it is to surface command-line parsing errors once we are able to build an AST.
Because people most likely won't care about these errors once clangd is working for them, and even get annoyed since most of the developers has little control over how their build system generates their compile
commands, therefore it will be really hard to get rid of those errors showing up in all the files they want to edit.

How common is this? Do we have any particular examples?

So I suggest only surfacing these if we are not able to create a compiler invocation or build an ast at all. WDYT?

My worry comes from the fact that I don't think we can guarantee that clangd is working for the users after we encountered those errors.
In particular, after D66731, we will be getting a compiler invocation and building the AST in many more cases. However, the AST might be completely bogus, e.g. because we did not recognize actually important flags.
Surfacing this error to the user is useful, because that would explain why clangd is rusty, gives weird results, etc.

In turn, the users will report bugs and we'll be able to collect data on whether this is noisy or everyone is doing compile_commands.json wrong and we should do something about it or whether it's actually very rare in practice and turns out to be a useful indicator for the users.
Either result seems plausible.
I'm all in for hiding annoying errors, but I don't know what the failure modes are in that particular case and I'd be tempted to first see whether surfacing the errors would actually end up producing those annoying errors and what the errors are.

How common is this? Do we have any particular examples?

Well, I don't have any data to prove, but my experience has been so far; most of the people don't touch Makefiles apart from adding their source files to the
CMakeLists within their project directory, and only some of them are comfortable with providing additional flags. For example having a new warning flag, or
a flag that has been recently renamed. Then your compiler won't complain during builds, but you will get those bogus errors in clangd.

My worry comes from the fact that I don't think we can guarantee that clangd is working for the users after we encountered those errors.
In particular, after D66731, we will be getting a compiler invocation and building the AST in many more cases. However, the AST might be completely bogus, e.g. because we did not recognize actually important flags.
Surfacing this error to the user is useful, because that would explain why clangd is rusty, gives weird results, etc.

In turn, the users will report bugs and we'll be able to collect data on whether this is noisy or everyone is doing compile_commands.json wrong and we should do something about it or whether it's actually very rare in practice and turns out to be a useful indicator for the users.
Either result seems plausible.
I'm all in for hiding annoying errors, but I don't know what the failure modes are in that particular case and I'd be tempted to first see whether surfacing the errors would actually end up producing those annoying errors and what the errors are.

OK, I was leaning on "not showing up if we could build an AST" side because I thought that in those cases even if we have errors while building CompilerInvocation
these would most of the time be some small things that would not affect how AST was generated in a sense that users would care. Therefore we would end up annoying
users most of the time and be really useful only in some small portion of the cases.

But again this isn't backed by any data, following your approach of turning it on and changing it later on if we annoy people also sounds OK to me. Also as discussed offline
I misunderstood the scope of the diags generated while building compiler invocations. Since this only surfaces errors, it should be mostly OK.

clang-tools-extra/clangd/ClangdUnit.cpp
465 ↗(On Diff #217202)

could you preserve the previous order by inserting CompilerInvocationDiags here instead of the ones in AST ?

clang-tools-extra/clangd/Diagnostics.cpp
399 ↗(On Diff #217202)

I suppose we can get rid of the one in StoreDiags::EndSourceFile now ?

ilya-biryukov marked an inline comment as done.Aug 27 2019, 5:59 AM

Just to clarify: we only want to surface errors, not warnings from command-line to avoid too much nosie; I'm totally on board with not spamming users with "unknown compiler warning" errors at the start of each file.
We would only show things like unknown argument "-mdisable-fp-elim". Given that clang understands most flags from other compilers (to be a good drop-in replacement), we should end up showing errors when things actually can be significantly broken.

And we will now show this errors only in cases when we could not build the AST at all before (as CompilerInvocation was null), so overall we should not be producing too much new noise

clang-tools-extra/clangd/ClangdUnit.cpp
465 ↗(On Diff #217202)

Do you think we should add CompilerInvocatioDiags at the end, rather than at the start of the diagnostics list?
Why would that be better?

Mostly trying to keep the chronological order here: command-line diagnostics came first, followed by preamble, followed by AST diagnostics.

kadircet added inline comments.Aug 27 2019, 6:20 AM
clang-tools-extra/clangd/ClangdUnit.cpp
465 ↗(On Diff #217202)

I totally agree that your current ordering makes more sense. I just wasn't sure why it was done in the opposite way before(first AST, then preamble), if it doesn't cause any test breakages we should be good though.

clang-tools-extra/clangd/Diagnostics.cpp
603 ↗(On Diff #217202)

I suppose this one is also not required anymore, as you clear it during flush

clang-tools-extra/clangd/unittests/TUSchedulerTests.cpp
755 ↗(On Diff #217202)

as discussed offline, could you also add a test to make sure this does not fire on "-Wunknown-warning-flag" ?

ilya-biryukov marked 2 inline comments as done and an inline comment as not done.
  • Do not flushDiag() on EndSourceFile
  • Do not reset WasAdjusted outside flusLastDiag()
  • Add a test that unknown warnings do not produce diagnostics
ilya-biryukov added inline comments.Aug 28 2019, 1:44 AM
clang-tools-extra/clangd/ClangdUnit.cpp
465 ↗(On Diff #217202)

The intention was to do less copies: normally there are less diagnostics in preamble than in the main file, so we would add the main file diagnostics first and later prepend the preamble diagnostics.

Performance-wise it does not matter and the new version reads better as it only appends and the order of items is natural.

clang-tools-extra/clangd/Diagnostics.cpp
399 ↗(On Diff #217202)

I've removed the call to flushDiag() from the function, but kept the function to reset the LangOpts to None for symmetry with setting them on BeginSourceFile.
Don't think this matters in practice, looks a bit better aesthetically (we "clean up" when the file goes out of scope).

  • Remove accidental change
kadircet accepted this revision.Aug 28 2019, 1:45 AM

LGTM, thanks!

This revision is now accepted and ready to land.Aug 28 2019, 1:45 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 28 2019, 2:25 AM