If an import directive was put into wrong context, the error message was obscure,
complaining on misbalanced braces. To get more descriptive messages, annotation
tokens related to modules are processed where they must not be seen.
Details
Diff Detail
Event Timeline
This doesn't recover from the errors properly any more (leaving junk behind in the 'current module' stack), and somewhat breaks the original purpose of these checks, which is to detect and diagnose missing close braces. Here's what I suggest:
- Keep the isEofOrEom calls as-is.
- Diagnose the unexpected EOM tokens as you do now (but customize the diagnostics in the case of a module_end token).
- Once you've diagnosed an EOM token, don't consume it; instead update the annotation so you know not to diagnose it again, and bail out of parsing the current construct.
Then, when we hit an EOM, we'll diagnose it once at the innermost level, bail out all the way to the top level, and then handle it properly.
include/clang/Sema/Sema.h | ||
---|---|---|
1802–1805 | I don't think this is the right interface. The calls from the parser expect an error to be issued, and don't do the right thing if the module import is valid, so this should instead be called something more like diagnoseMisplacedModuleImport, and should either assert if the current context allows module imports or not perform the check. | |
lib/Parse/ParseDeclCXX.cpp | ||
2857–2864 | Please factor this out into a bool TryParseMisplacedModuleImport() function. | |
2858–2859 | For module_end, we probably still want to give the "missing '}'" diagnostic, because that's almost certainly the problem if we reach the end of a module while still within a function or namespace. Maybe pass a flag into checkModuleImportContext to say whether you're entering or leaving the module? | |
2862–2863 | This is not a good way to recover from the problem: for that, we should probably do what we used to do, and bail out to the top level. |
Fixed module boundary treatment.
This version must fix problems in the treatment of annot_modulbegin and
annot_module_end. Error recover for the token annot_module_include found
in wrong context can be made as previously, just by ignoring it. To handle
token 'annot_module_end' parser bails out to upper level, where the module
end can be processed, it will complain on missing '}' etc. As for
'annot_module_start', now parser tries to recover by skipping all content
till matching 'annot_module_end'.
Also made changes according to reviewer's notes.
include/clang/Basic/DiagnosticParseKinds.td | ||
---|---|---|
1016–1017 | These diagnostics are phrased from the point of view of the implementation, and should instead be phrased from the point of view of the user. The first diagnostic would be more useful if it said something like "missing '}' at end of module". That's pretty close to the diagnostic we already produce; maybe instead of changing the behavior of this case you could improve BalancedDelimiterTracker to use a custom diagnostic for the case where the missing closing delimiter is instead an end-of-module token? The second diagnostic seems like an improvement, but from the user's perspective, this wasn't the start of a submodule, it was a module import / #include. Maybe just use diagnoseMisplacedModuleImport here? | |
lib/Parse/ParseDeclCXX.cpp | ||
3070 | This shouldn't be named try* if it has a precondition that the current token is an EoM token. (The try form, if there is one, should check for that token itself.) | |
lib/Parse/Parser.cpp | ||
2028 | This will result in you producing two errors "unexpected end of module" and "missing }" for each construct that is still open at the end of the module. I don't think that's what we want. | |
2032 | This is liable to produce bad follow-on diagnostics; I don't think this is a reasonable way to recover. I can see a few feasible options here:
My preference would be either (1) or (2). But either way, we should diagnose the still-open bracketing constructs, because the problem is likely to be a missing close brace at the end of an unrelated header file. | |
2047 | Again, this will recover very poorly. In this case, our best bet seems to be to recover by importing the module. |
lib/Parse/Parser.cpp | ||
---|---|---|
2032 | I strongly prefer (1). In all cases I have recorded in my notes when modularizing, the error: expected '}' diagnostic indicated one of two things:
For the sake of our users, it is probably best to immediately fatal error and suggest an alternative (the suggestions can be a separate patch; my recommendations are below). I believe a user is most likely to run into this diagnostic when developing an initial set of module maps, so our diagnostic should be tailored to that audience. These users may have little experience with modules when they encounter this diagnostic, so a notes suggesting a likely remedy helps them develop confidence by providing authoritative advice (and avoiding a round trip to the documentation). My fine-grained recommendations from experience are:
".inl" files are theoretically a problem inside namespaces, but I have not observed this issue in practice. In my experience code taking the effort to have ".inl" files is quite clean and avoids such textual dependencies. The issues I have seen in practice with ".inl" files usually end up coming out through a different part of clang. Improving that is for a separate patch though. |
Updated patch
Thanks to all for fruitful discussion!
The new version tries to addresses review notes. It differs from the previous version in:
- Method tryParseMisplacedModuleImport now depends only on current token and is used in loop condition. Probably a better name for it is something like skipMisplacedModuleImport.
- BalancedDelimiterTracker is updated so that it prints different message if module ended before closing bracket has been seen.
- If annot_module_begin is seen where it must not, recovery depends on context. Inside a namespace the module is imported as a recovery, a note suggesting to move import to top level is issued. Inside a class of a compound statement fatal error issued, there is no way to handle this case at upper level.
test/Modules/auto-module-import.m | ||
---|---|---|
89 | I think what this means is marking the *header* as textual. So I would suggest the wording "consider marking the header as textual". |
Sorry for the delay. I'm generally happy with the direction of this patch; I think this will substantially improve how we diagnose this set of problems.
include/clang/Basic/DiagnosticParseKinds.td | ||
---|---|---|
1016 | I would drop the "the" in this error. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
7774 | We'll already have said something like: error: import of module 'foo' appears within class X note: class X begins here There are two likely situations here: either the header was intended to be included textually, or the user forgot a } or similar and didn't expect for the import to be inside the class. Either way, this note is not useful (either they thought it *was* at the top level, or the note is not applicable). How about dropping this note? The "begins here" note seems to have already covered the interesting case. | |
7775 | Textual headers are not the common case, so "consider" seems too strong; how about weakening this to something like: "if this header is intended to be included textually, mark it 'textual' in the module map" | |
lib/Parse/ParseDeclCXX.cpp | ||
213 | Move this try... call to the end of the && here and elsewhere; it's much more likely that we'll hit an r_brace. | |
lib/Parse/Parser.cpp | ||
2023–2024 | This is the opposite of how all the other try functions behave: they return true if they consumed the thing they were asked to consume. | |
2025–2027 | Refactor this as follows: Rename this to parseMisplacedModuleImport. Add an inline tryParseMisplacedModuleImport to Parser.h, which checks if the current token is an EOM token. If it is, it calls this function. If it is not, it returns immediately. The purpose of having these try functions with thin wrappers in the .h file is so that we can inline the cheap "are we looking at the right kind of token" check (and allow it to be combined with other nearby checks on the token kind). | |
lib/Sema/SemaDecl.cpp | ||
14361–14366 | It looks like this will currently produce repeated diagnostics for an import within multiple namespaces: module.map: module Foo { module X { header "foo1.h" } module Y { header "foo2.h" } } foo1.h: namespace A { namespace B { #include "foo2.h" } } I believe this will diagnose:
I think we should treat this case as fatal too. | |
14372 | I think we should only suggest changing the module map if our current module and the module of the included file are within the same top-level module. It would be unfortunate for us to suggest modifying, say, the libc module map if we see some end user's code including a C library header in the wrong context. You can check M->getTopLevelModuleName() against S.getLangOpts().CurrentModule and S.getLangOpts().ImplementationOfModule to see whether it's a submodule of the current module. |
include/clang/Basic/DiagnosticSemaKinds.td | ||
---|---|---|
7774 | I think this makes sense. error: ... import ... appears within ... seems like sufficient information about the nature of the error. | |
lib/Sema/SemaDecl.cpp | ||
14372 | From my experience, your suggested heuristic is insufficient; we definitely need to cover the case that crosses top-level modules (or is from a .cpp file to a module it imports); actually, that is the only case that I have run into in practice. In this patch, I think the right thing is to just not emit note_module_need_top_level nor note_module_need_textual. A separate patch can use a flag -Wintial-modules-migration (or whatever) to emit these notes a bit more aggressively (the flag would be off by default). | |
test/Modules/misplaced.cpp | ||
1–12 | This test does not cover the appears within class case. |
Thanks for helpful notes!
There are similar cases in Objective C. Should they be implemented in this patch or separate patch is OK?
include/clang/Basic/DiagnosticParseKinds.td | ||
---|---|---|
1016 | Done. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
7774 | The message is removed. | |
7775 | This message is also removed, as Sean proposed. | |
lib/Parse/ParseDeclCXX.cpp | ||
213 | Moved. | |
lib/Parse/Parser.cpp | ||
2023–2024 | Meaning of the return value is inverted. | |
2025–2027 | Thank you for explanation. Fixed. | |
lib/Sema/SemaDecl.cpp | ||
14361–14366 | Yes, the messages were as you described. Now import inside a namespace is also a fatal error. Test misplaced-4.cpp checks this case. | |
14372 | As Sean proposed, these are not emitted. | |
test/Modules/misplaced.cpp | ||
1–12 | This case was checked in malformed.cpp but for the sake of consistency separate test was added. |
include/clang/Parse/Parser.h | ||
---|---|---|
2560–2562 | Use Tok.isOneOf here. |
I would drop the "the" in this error.