This is an archive of the discontinued LLVM Phabricator instance.

Try building complete AST after a fatal error was emitted if further diagnostics are expected
Needs ReviewPublic

Authored by Dmitry.Kozhevnikov on Aug 8 2018, 10:05 AM.

Details

Summary

Related to https://reviews.llvm.org/D50455

There is some code here and there that assume that no sane output is required if a fatal error has occurred. When using clangd/libclang, it's no longer true: diagnostics might still be requested, and AST might still be required for other IDE/tooling features, so it has to be as complete as possible.

Here I try to separate the following use cases:

  1. Some clients check hasFatalErrorOccurred() because they are known to work unstable in presence of compile errors and want to mitigate it - they'll work as before
  2. However, we don't want to take shortcuts in PP and Sema and still want to process include directives and instantiate templates

Note: I've found out that initially the flag in DiagnosticsEngine (which is now called SuppressAfterFatalError) had different meaning and was just demoting all fatal errors to non-fatal (https://reviews.llvm.org/rL262318). This would also fix this issue, however, it was partly reverted in https://reviews.llvm.org/rL301992 to the current state. Maybe we should go with the old approach instead (I assume the issue was that this flag was not serialized/restored, but probably should?)

Diff Detail

Repository
rC Clang

Event Timeline

This seems sensible to me.

include/clang/Basic/Diagnostic.h
758

Looks like this line wasn't formatted with clang-format.

On a second look I think that there is value separating the concepts of 'producing diagnostics' after hitting the fatal error (which is SuppressAfterFatalError currently does), and trying to build a more complete AST.
For example,

  • An editor client might not want to show the diagnostics after the fatal error has been reached to match the diagnostics observed during build, but it would benefit from a more complete AST for other editing features.
  • Index-while-building: the compiler shouldn't show the diagnostics after a fatal error has been reached, but the indexer would be able to produce better indexing data from a more complete AST.
Dmitry.Kozhevnikov changed the repository for this revision from rCTE Clang Tools Extra to rC Clang.

Addressed the review comment about separating "suppressing diagnostics" from "providing more complete AST"

On a second look I think that there is value separating the concepts of 'producing diagnostics' after hitting the fatal error (which is SuppressAfterFatalError currently does), and trying to build a more complete AST.

Sorry for the delay.

I've introduced an accordingly-named method. I haven't created another flag yet, though, because it feels a bit weird currently since there are no clients that require it, added a todo note instead. What do you think?

unittests/Frontend/PCHPreambleTest.cpp
220

The condition in SemaTemplateInstantiate.cpp was causing the lookup of Alias to fail: it was marked as invalid and wasn't stored in the preamble. I'll be happy to make a more straightforward test, but I've not yet managed to: when placing everything in a single file, the lookup succeeds.

jkorous added a subscriber: jkorous.

Adding Volodymyr who landed related patch recently.

Have you checked that produced AST is of sufficient quality to be used? Because, for example, for invalid code error recovery tries to keep going but the end result isn't always good.

In the test you verify that you can ignore bogus includes. Is this the real-world use case you want to address? Or something like "stddef.h not found"?

Have you checked that produced AST is of sufficient quality to be used? Because, for example, for invalid code error recovery tries to keep going but the end result isn't always good.
In the test you verify that you can ignore bogus includes. Is this the real-world use case you want to address? Or something like "stddef.h not found"?

I'm assuming it should help when the recovery is good enough (e.g. when the symbols from the missing header are used rarely, or only used inside function bodies). For the case like "stddef.h not found" there is a chance that resulting AST wouldn't be too useful, however, for IDE-like tasks, where we usually want to get as much info from the incomplete code as possible, it's not too bad IMO (comparing to just disabling all IDE features altogether after a missing include).

The only required property is that the resulting issues shouldn't be catastrophic (further processing shouldn't crash or hang), however, it would be a separate issue - after all, you always can remove the include directive in question.

The main use case I have is that it's inconvenient to browse or edit code in clang-based IDEs when either:

  • there is a typo in an include directive somewhere (e.g. because I'm still modifying the code, moving files around, etc)
  • there is a missing/misconfigured 3rd-party dependency. Obviously, the code is not supposed to compile, however, I'd expect e.g. "Goto Declaration" still work for unrelated code, assuming it recovered properly, which, in my anecdotical experience, it usually does
  • (a very specific use case I'm often hitting) you often have missing includes when you browse llvm/clang code base and haven't built some part of it, so you don't have generated headers. The parser usually recovers very well in this case, if not for these two cut-offs I'm changing in this patch

Thanks, Dmitry, for your explanation. It does help to understand your use case better.

What about having a mode that treats missing header as non-fatal error? Because I believe there are cases where there is no sense to continue after a fatal error. For example, consider hitting the error limit. Showing too many errors isn't useful and also after 50 errors it is a little bit too optimistic to assume that AST is in a good shape. I don't know if there are other fatal-but-we-can-continue errors and if we need a more general solution. So far looks like missing include is the biggest blocker for IDE-like functionality.

Sorry for chiming in, wanted to add my 2 cents to the conversation.

What about having a mode that treats missing header as non-fatal error? Because I believe there are cases where there is no sense to continue after a fatal error. For example, consider hitting the error limit. Showing too many errors isn't useful and also after 50 errors it is a little bit too optimistic to assume that AST is in a good shape. I don't know if there are other fatal-but-we-can-continue errors and if we need a more general solution. So far looks like missing include is the biggest blocker for IDE-like functionality.

Your arguments are definitely valid (too many errors may be confusing and AST may not be prepared to recover on some fatal errors), but I would argue that the IDE use-case is different enough from the command-line tools to make the new behavior a preferred one.

Specifically, a few questions regarding the aforementioned fatal errors:

  1. Why should hitting some error limit be considered a fatal error? One of the reasons I see for the command-line tools is not spamming the output buffers with too many errors, so that navigating to the first errors is easier. It looks like a non-issue for IDEs: they can both show all emitted errors in the text editor and make it easy to navigate to the first error (by providing a convenient UI for an error list, shortcuts to go to the first error, etc.). On the contrary, not seeing errors where I type because clang hit the limit before my line looks like a confusing experience for the editors/IDEs.
  1. Why should an unresolved include (which is considered a fatal error) give a totally different result from the missing include (which is, obviously, undetectable)? They both require the same amount of work to recover and we obviously want clang to work in absence of some includes.

Besides, errors can also be easily "promoted" to fatal with command-line flags (-Wfatal-errors) and editor tools should probably respect those and not override their severity.

W.R.T. to the AST not being good enough: it may not be, but shouldn't we instead invest in improving the recovery on things like unresolved references?
This would give better experience in absence of non-fatal errors too (common case when copy-pasting code, removing too many #include directives from the file, etc.) and it looks doable.

Overall, I would argue that letting clang recover from fatal errors is the right thing to do for IDEs and editor integrations and the original patch was moving in the right direction.
WDYT?

When lookin through the list of the fatal errors, I think there are different categories:

  1. "Scary" ones - "module was relocated", "invalid vfs overlay file", we probably shouldn't try to recover from it
  2. "User" errors (include not found, too many errors) - we definitely shouldn't treat them as fatal in IDE
  3. (subclass of the previous one): "something is too deep" (instantiations, brackets) - I'm afraid treating them as non-fatal right now would lead to a possibility of them happening again and again as we recover and proceed. So, in the future, the recovery might be more clever (i.e. going all the way up the instantiation/brackets stack, and then proceeding normally), however, while it's not done, I'm a bit uneasy demoting them from fatal.

So I'm preparing an alternative patch to demote "file not found" in include directive from the fatal error in .td file, and then immediately promote it back by default (but not in clangd). WDYT?

In general, I find the whole concept of "Fatal error occurred/Uncompilable error occurred/Too many errors occurred/etc" too coarse-grained for tooling/IDEs. Parse/PP/Sema should probably try harder to recover, and not report such global state change. I'm not ready to approach it, though :)

When lookin through the list of the fatal errors, I think there are different categories:

  1. "Scary" ones - "module was relocated", "invalid vfs overlay file", we probably shouldn't try to recover from it
  2. "User" errors (include not found, too many errors) - we definitely shouldn't treat them as fatal in IDE
  3. (subclass of the previous one): "something is too deep" (instantiations, brackets) - I'm afraid treating them as non-fatal right now would lead to a possibility of them happening again and again as we recover and proceed. So, in the future, the recovery might be more clever (i.e. going all the way up the instantiation/brackets stack, and then proceeding normally), however, while it's not done, I'm a bit uneasy demoting them from fatal.

So I'm preparing an alternative patch to demote "file not found" in include directive from the fatal error in .td file, and then immediately promote it back by default (but not in clangd). WDYT?

It feels we're uncertain whether continuing on fatal errors would work, so we choose to not do it.
E.g., about the scary ones:

  • on "invalid vfs overlay file" we wouldn't even run to the parsing, as it fires when parsing compile args and noone recovers from that.
  • on "module relocated". Haven't seen this one, so speculating here. If this happens at import directives, can we just pretend the import was unresolved and continue?

Having a mechanism to recover on specific errors seems like a more fine-grained approach that should work, but the current version of the patch looks very simple and if it turns out it works.
It would be a shame to have the added complexity if the simple solution would also work.

And there is -Wfatal-errors that turns every error into fatal, how would promote/demote approach work in that case?
Maybe we can first add an option, experiment with it and see if it works? And if not, it would give us enough data to classify the failure modes and fix them?

I wonder what others think about it.

In general, I find the whole concept of "Fatal error occurred/Uncompilable error occurred/Too many errors occurred/etc" too coarse-grained for tooling/IDEs. Parse/PP/Sema should probably try harder to recover, and not report such global state change. I'm not ready to approach it, though :)

Totally agree, I can't imagine a fatal error that makes it absolutely impossible for parser to recover (after the parser starts, i.e. the compilation args are correct). Clang can (and will) produce too much noise when recovering, but that's up to IDEs/tools to fix this.

PS we definitely want recovery from missing includes in clangd.
We have this interesting problem when building the preamble:

#include "resolved.h" // imagine this defines a struct named 'Foo'
#include "unresolved.h"

int main() {
  Foo f;
  f. // <-- complete after dot here
}

In that case, completion would work even when the second header is unresolved.
However, if we swap the includes, completion stops working because the fatal error (unresolved include) happens before the definition of the struct, so preamble does not contain the struct definition.

Agree that fatal/non-fatal error is too coarse and tooling/IDEs need more details and more control to provide better experience. But I don't think we are in a state to claim that all errors are recoverable (in theory and in current implementation). Instead of continuing on all errors, I prefer to select errors that are important for tooling and improve those first.

Regarding the current patch, I don't like creating coupling between hasFatalErrorOccurred and shouldRecoverAfterFatalErrors. Looks like after this patch you'll need to call these methods together in many cases. For example, probably Sema::makeTypoCorrectionConsumer in

if (Diags.hasFatalErrorOccurred() || !getLangOpts().SpellChecking ||
    DisableTypoCorrection)
  return nullptr;

should check shouldRecoverAfterFatalErrors too.

nridge added a subscriber: nridge.May 28 2021, 1:07 AM