This is an archive of the discontinued LLVM Phabricator instance.

[modules] PR20507: Avoid silent textual inclusion.
ClosedPublic

Authored by silvas on Jun 12 2015, 3:29 PM.

Details

Summary

This pretty naively follows the implementation suggested by Richard.

I'd like to switch these over to be fatal errors (since they are
essentially similar to err_pp_file_not_found), but I'm still investigating
some test failures that happen when I switched them over.

I'm really not a fan of the `M->isAvailable(getLangOpts(), getTargetInfo(),
Requirement, MissingHeader)` function; it seems to do too many things at
once, but for now I've done things in a sort of awkward way.

The changes to test/Modules/Inputs/declare-use/module.map
were necessitated because the thing that was meant to be tested there
(introduced in r197805) was predicated on silently falling back to textual
inclusion, which we no longer do.

The changes to test/Modules/Inputs/macro-reexport/module.modulemap
are just an overlooked missing header that seems to have been missing since
this code was committed (r213922), which is now caught.

Diff Detail

Repository
rL LLVM

Event Timeline

silvas updated this revision to Diff 27609.Jun 12 2015, 3:29 PM
silvas retitled this revision from to [modules] PR20507: Avoid silent textual inclusion..
silvas edited the test plan for this revision. (Show Details)
silvas added reviewers: rsmith, benlangmuir, djasper.
silvas updated this object.
silvas added a subscriber: Unknown Object (MLST).
rsmith added inline comments.Jun 12 2015, 3:46 PM
lib/Lex/ModuleMap.cpp
346–347 ↗(On Diff #27609)

Should we also teach isBetterKnownHeader to consider an available module to be better than an unavailable one? (If I have two modules covering a header file, and one of them requires "blah" and the other requires "!blah", we should presumably pick the available module.)

silvas added inline comments.Jun 12 2015, 8:29 PM
lib/Lex/ModuleMap.cpp
346–347 ↗(On Diff #27609)

I'm having a hard time thinking of a legitimate use for that, or in general to ever have two modules referencing the same file. Do you have something in particular in mind?

But that does raise the larger question is how to make this insensitive to whatever order we are iterating here (which I assume is determined in some relatively fragile way by the order of command line arguments or order in the module map file). From a user's perspective, it is extremely jarring to have the compiler's behavior depend in a fragile way on such things (like the unix linker order-dependence fiasco). I would prefer to just have a hard error if two modules reference the same header, thus we wouldn't even be iterating here at all, and users will always be immediately alerted to errors.

benlangmuir added inline comments.Jun 16 2015, 2:11 PM
include/clang/Basic/DiagnosticCommonKinds.td
89–90 ↗(On Diff #27609)

When this diag was only used for @import it was okay because you get a separate diagnostic telling you that the module build failed, so it was clear what happened.

@import Foo; error: header 'bar.h' not found
error: could not build module 'Foo'

But now it's odd to get an error about a different header than the one you included with no indication that it's because of a module:

#include <foo.h> // error: header 'bar.h' not found

We could shoehorn the module name into the diagnostic, or perhaps emit a note on the header decl in the module map file.

benlangmuir added inline comments.Jun 16 2015, 2:13 PM
include/clang/Basic/DiagnosticCommonKinds.td
89–90 ↗(On Diff #27609)

Err, reading fail on my part, you're already emitting this at the header decl location. Maybe we should have a note for the module import location though.

silvas updated this revision to Diff 27809.Jun 16 2015, 7:02 PM

Add note about location where implicit import happened.

Any ideas on how to better explain/word it would be appreciated.

benlangmuir added inline comments.Jun 19 2015, 7:36 AM
include/clang/Basic/DiagnosticLexKinds.td
626 ↗(On Diff #27809)

I'd just drop "submodule of top-level ".

silvas added inline comments.Jun 19 2015, 4:18 PM
include/clang/Basic/DiagnosticLexKinds.td
626 ↗(On Diff #27809)

I think it's an important qualification because the module that was in fact imported is not %0. %0 is the top-level module containing it, and the error is because some submodule of %0 (perhaps the one being imported, perhaps not) has the missing file.

From a user's perspective: "The header I included didn't ask for %0 to be imported. It asked for this submodule of it". Ideally we would be able to somehow communicate the "unity build" aspect of top-level modules for this to make sense, but I couldn't find a way to fit all that in a diagnostic.

Does that make sense?

benlangmuir edited edge metadata.Jun 30 2015, 6:37 AM

From a user's perspective: "The header I included didn't ask for %0 to be imported. It asked for this submodule of it"

Alternatively, the user may not be aware of the internal structure of a top-level module and just expect it to work like it works with header includes :-) I'd still prefer the diagnostic without "submodule of top-level", which is also consistent with our "While building module" include-stack style notes. But I don't think this is worth holding up your patch over.

This question still seems to be outstanding:

So concretely for the present patch just the change to isBetterKnownHeader to take into account of isAvailable?

Richard, can you comment?

I tried this patch out and found a few issues I've commented on inline. It also exposes the two issues below:

  1. This patch exposes a compatibility issue with our 'Darwin' module map file (the module map that covers most of /usr/include on Darwin platforms).
error: module 'Darwin.C.excluded' requires feature 'excluded'

This "excluded" submodule was used as a hack for "assert.h" before the explicit support for "exclude" and "textual" headers were added to clang. After this patch, it won't be possible to include assert.h on Darwin, because the module it is in is missing a requirement. I think it would be very bad to break compatibility here, so I propose we add a compatibility hack to module map parsing that treats a module with a the requirement "excluded" as if all its headers were declared with "exclude". What do you think? I'm happy to implement this.

  1. If a require decl is written *after* a header decl in a module, that header still gets treated as required. We should fix that before this is committed. E.g.
module Top {
  module A { header "foo.h" requires nope }
  module B { requires nope header "bar.h" }
}

We'll complain about missing foo.h, but not about bar.h.

lib/Lex/PPDirectives.cpp
1671 ↗(On Diff #27809)
  1. This uses MissingHeader.FileNameLoc, which we just proved is invalid :-)
  2. We should also show diag::note_implicit_top_level_module_import_here in this diagnostic.
test/Modules/missing-header.cpp
13 ↗(On Diff #27809)

We should have a test for including a header from a module with a missing requirement. And one where a missing header from a different submodule is okay because the submodule is missing a requirement.

rsmith edited edge metadata.Jul 1 2015, 3:47 PM
  1. This patch exposes a compatibility issue with our 'Darwin' module map file (the module map that covers most of /usr/include on Darwin platforms).
error: module 'Darwin.C.excluded' requires feature 'excluded'

This "excluded" submodule was used as a hack for "assert.h" before the explicit support for "exclude" and "textual" headers were added to clang. After this patch, it won't be possible to include assert.h on Darwin, because the module it is in is missing a requirement. I think it would be very bad to break compatibility here, so I propose we add a compatibility hack to module map parsing that treats a module with a the requirement "excluded" as if all its headers were declared with "exclude". What do you think? I'm happy to implement this.

Should this be textual instead of exclude? That's generally a better way to handle assert.h, I think. (That said, I don't think it really matters, because no-one cares which module assert.h is in, and we don't yet build PTH for textual headers.)

So concretely for the present patch just the change to isBetterKnownHeader to take into account of isAvailable?

Yes, I think so.

Should this be textual instead of exclude? That's generally a better way to handle assert.h, I think. (That said, I don't think it really matters, because no-one cares which module assert.h is in, and we don't yet build PTH for textual headers.)

Yeah, assert.h should really be textual. I just thought it seemed odd that marking something requires excluded would correspond to textual instead of exclude. I think either would be fine, really. Also, we'd probably want a warning about this behaviour.

And... I just discovered another module map bug this change exposed. We mislabeled the IOKit.avc submodule as requires cplusplus. After this patch, the IOKit module (and therefore all of Cocoa) won't build outside C++ modes :-( I propose an egregious special case hack that checks for the offending module by name and drops the bogus requirement.

rsmith added a comment.Jul 1 2015, 5:26 PM

I'm fine with special-case hacks to keep the current Darwin module maps working, so long as they're relatively self-contained and very unlikely to trigger in any other cases. (We do equally ugly things to parse broken libstdc++ headers.)

I agree that exclude makes more sense for requires excluded as a "supported" feature, but if this is another egregious hack for Darwin, I don't think it matters which we choose. (Maybe we should also check the module name here?)

silvas updated this revision to Diff 29234.Jul 7 2015, 8:14 PM
silvas edited edge metadata.

Ben:

  • Added better source location.
  • Updated testing. I renamed to auto-import-unavailable.cpp and expanded testing as you asked for. Before, I was piggybacking on missing-header.m and related files, but since that is a different code path (and has different requirements and diagnostics), I thought it was better to keep the test separate and self-contained.
  • also emit note_implicit_top_level_module_import_here on missing requirement.

Richard:

  • available-is-better.cpp and the changes to isBetterKnownHeader to take into account availability.

Does this look okay?

Ben: hopefully the present patch is complete enough for you to start
figuring out exactly what hacks to add and where.

I agree that exclude makes more sense for requires excluded as a "supported" feature, but if this is another egregious hack for Darwin, I don't think it matters which we choose.

Sure, I'm just as happy to use textual, since every case I've seen that's the more correct behavior.

(Maybe we should also check the module name here?)

I found another use of this pattern in our Tcl module. I suspect when I look back at older versions of the SDK I'm going to find more of these cases. How about we don't limit this hack to specific modules, but we add a warning to say it's treating these as "textual".

  • Added better source location.

It should really be the requirement location, but I guess we don't store that anywhere. This can be improved separately.

  • Updated testing. I renamed to auto-import-unavailable.cpp and expanded testing as you asked for. Before, I was piggybacking on missing-header.m and related files, but since that is a different code path (and has different requirements and diagnostics), I thought it was better to keep the test separate and self-contained.
  • also emit note_implicit_top_level_module_import_here on missing requirement.

Changes look good, thanks.

I replied to your email about the issue when a "requires" clause comes *after* a header declaration. I didn't actually hit that problem in the wild - only in a synthetic test - so maybe it can be addressed separately. If you and Richard agree, then from my perspective your patch LGTM and we just need to get the Darwin hacks in before it lands.

Ben: hopefully the present patch is complete enough for you to start
figuring out exactly what hacks to add and where.

Definitely. I can work from this, thanks.

rsmith added inline comments.Jul 8 2015, 2:05 PM
lib/Lex/PPDirectives.cpp
1665 ↗(On Diff #29234)

I think this whole block should be moved down to line 1761 or so:

  1. The code currently ensures that it always calls InclusionDirective on the callback object for every #include, even if that include fails.
  2. We don't yet know that the file is actually part of SuggestedModule; it could be in the directory of an umbrella header whose module is unavailable, but it might not be part of that module.
  3. If we somehow got a suggested module but no file, we should not produce additional spurious diagnostics beyond the "file not found" diagnostic we already produced.
test/Modules/Inputs/auto-import-unavailable/module.modulemap
9 ↗(On Diff #29234)

Please add some content to the empty files (even if just a comment) or this test will misbehave on content-addressed file systems, where all the empty files will be treated as the same file.

rsmith added inline comments.Jul 8 2015, 7:13 PM
lib/Lex/PPDirectives.cpp
1665 ↗(On Diff #29234)

One other thing: we should only diagnose when ShouldEnter is false. If we're entering the file anyway, we don't need the module to be available; only those files within it that we're actually entering need to be present.

rsmith added inline comments.Jul 8 2015, 7:16 PM
lib/Lex/PPDirectives.cpp
1665 ↗(On Diff #29234)

Hmm, that's not quite right (ShouldEnter will be false the second time we reach a header with an include guard too). I think the right thing is: don't diagnose this if the suggested module's top-level module is either the current module or the "implementation of" module (that is, if it's a module we'll /always/ enter textually).

silvas added a subscriber: silvas.Jul 9 2015, 3:47 PM

Something weird seems to have been going on with Phab emails and I missed your patch for the header/requires ordering issue. Anyway, I fixed it in r242055 with a similar approach.

silvas updated this revision to Diff 30196.Jul 20 2015, 2:49 PM

Updating based on Richard's feedback.

Richard, is this something like you imagined?

The PPCallbacks is still not being called, like the existing branch
// We hit an error processing the import. Bail out.. The fix will be
similar for both, so this at least isn't making it any harder to fix the
situation.

rsmith added inline comments.Jul 20 2015, 3:04 PM
lib/Lex/PPDirectives.cpp
1680–1698 ↗(On Diff #30196)

The call to loadModule a few lines below already does this. Is this change necessary?

silvas added inline comments.Jul 20 2015, 3:23 PM
lib/Lex/PPDirectives.cpp
1680–1698 ↗(On Diff #30196)

I think it is. We have more information available here and can diagnose the situation better. Ideally the call to loadModule here would take a Module * and could assert the module to be available.

rsmith accepted this revision.Jul 20 2015, 5:14 PM
rsmith edited edge metadata.
rsmith added inline comments.
lib/Lex/PPDirectives.cpp
1680–1698 ↗(On Diff #30196)

OK, that sounds like a good general direction. I don't like the duplication here, but if the plan is to eventually remove it, I'm OK with that.

I'm still concerned about the incomplete umbrella header case, but that really seems like a fundamental design problem with implicit module generation for incomplete umbrella headers -- if we can't build the umbrella header, we can't tell whether a file is part of it, so if the umbrella header module is unavailable, we have to either reject all inclusions of headers in its directory or silently ignore the umbrella header (and silently ignoring modules is exactly what you're preventing with this change).

This revision is now accepted and ready to land.Jul 20 2015, 5:14 PM

Since Richard accepted this, I'll pipe up that I have a patch adding the hacks needed to make this work on Darwin. I need to add some tests and then I'll put it up for review - hopefully tomorrow.

Yeah, I should have said: please wait for Ben to get the Darwin fixes ready before committing :)

Since Richard accepted this, I'll pipe up that I have a patch adding the hacks needed to make this work on Darwin. I need to add some tests and then I'll put it up for review - hopefully tomorrow.

Awesome, thanks! I'll hold off on committing until we have a working set of workarounds ready to roll.

Darwin module map hacks: http://reviews.llvm.org/D11403
We also need to fix clang's _Builting_Intrinsics.arm module. I posted a thread about this to cfe-dev for feedback: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044225.html

This revision was automatically updated to reflect the committed changes.