This is an archive of the discontinued LLVM Phabricator instance.

[clang][deps] Disable implicit module maps
ClosedPublic

Authored by jansvoboda11 on Feb 24 2022, 1:45 AM.

Details

Summary

Since D113473, we don't report any module map files via -fmodule-map-file= in explicit builds. The ultimate goal here is to make sure Clang doesn't open/read/parse/evaluate unnecessary module maps.

However, implicit module maps still end up reading all reachable module maps. This patch disables implicit module maps in explicit builds.

Unfortunately, we still need to report some module map files that aren't encoded in PCM files of dependencies: module maps that are necessary to correctly evaluate includes in modules marked as [no_undeclared_includes].

Depends on D120464.

Diff Detail

Event Timeline

jansvoboda11 requested review of this revision.Feb 24 2022, 1:45 AM
jansvoboda11 created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2022, 1:45 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
dexonsmith added inline comments.Mar 1 2022, 4:48 PM
clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
112

Should this be renamed?

125–126

Can you clarify why this is safe to remove, even though sometimes we do need to add module map files?

269–273

Is there another long-term solution to this that could be pointed at with a FIXME? E.g., could the module map be encoded redundantly here? If so, what else would need to change to make that okay?

275–278

Is there a FIXME to leave behind for tracking the anti-dependencies?

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 4:48 PM
jansvoboda11 added inline comments.Mar 2 2022, 7:02 AM
clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
112

Yes, that would make sense.

125–126

This collects the module map files of transitive dependencies. We used it to generate -fmodule-map-file= arguments for builds of modules.

However, with explicit modules, it's not necessary to provide -fmodule-map-file= arguments of (transitive) dependencies at all. The module map semantics are basically serialized in the PCM files themselves. That's why this is safe to remove.

The new code below handles a special case ([no_undeclared_includes]) where we still need to generate some -fmodule-map-file= arguments. Semantics of such module maps will not be found in any PCM, since we don't import the modules they describe.

Note that caller of this function used to pass FrontendOptions::ModuleMapFiles member as ModMapPaths. The member is now being initialized in the new code below.

269–273

I'm not sure I understand why this would warrant "long-term solution" or a FIXME. This code handles an existing feature that just happens to be a corner case from the dependency scanning point of view. (You can read up on the feature here.)

What do you mean by encoding the module map redundantly?

275–278

That's a good idea. I left a FIXME in D120464 to specifically serialize the information on anti-dependency introducing module maps here: D120464, but it would make sense to mention it here too.

Review feedback (add TODO & rename function)

jansvoboda11 retitled this revision from [clang][deps] Generate necessary "-fmodule-map-file=" arguments, disable implicit module maps to [clang][deps] Disable implicit module maps.Mar 7 2022, 1:59 AM
dexonsmith added inline comments.Mar 10 2022, 4:07 PM
clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
269–273

Yes, I know the feature.

... but I was probably wrong about how it affects the logic here.

I also now have two guesses at the scenario being handled here (previously I assumed (1)):

  1. A textual include from a module that is not marked as used. I wasn't sure what you meant by "tries to import", but I guess I thought it was just "loads the module map and finds the textual header listed there". IIUC, there's no actual attempt to import a module when the only thing referenced from it is a textual header, but I could see how parsing the module map could affect the state anyway.
  1. A modular include from a module that is not marked as used. Something like a __has_include, or a #include that fails but there's another search path with the same header. In this case, there'd be an actual import attempt, which would fail. And that could also affect state.

Can you clarify which scenario you need to handle? Or is it a 3rd?

I'm not sure I understand why this would warrant "long-term solution" or a FIXME.

This smells like a workaround to me. IIUC, sending in module maps that aren't actually "needed" except to reproduce side effects of failed queries.

My intuition is that the right long-term fix would involve isolate the failed queries from the compiler state in the scanning phase so that they don't have side effects. Then there would be no side effects to reproduce in the explicit build.

What do you mean by encoding the module map redundantly?

I think I was confused about scanning vs building, thinking that a module using a textual include (1) could be copied into each module PCM that directly imports it. (I know that this wouldn't scale right now for various reasons, but I wondered if there was some way to get there... but regardless, seems like it's totally unrelated)

jansvoboda11 added inline comments.Mar 11 2022, 1:27 AM
clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
269–273

The scenario being handled is the following:

  1. Modular #include of B from module A, where A is marked [no_undeclared_includes] and doesn't use B. Typically, that #include would be guarded by __has_include.

With implicit module maps disabled, the presence of module map B allows us to evaluate the __has_include the same way as with them enabled. This is the only reason we need module map B. There are no side effects from failed queries. The query failure itself is the behavior we need to reproduce.

I'm not even thinking about "another search path with the same header" in this patch.

dexonsmith accepted this revision.Mar 11 2022, 12:20 PM

LGTM, with a suggested change for the text of the comment.

clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
269–273

Thanks! I was making things way too complicated!

IIUC:

  • Without the module map, a __has_include would succeed because it'd be textually included.
  • With the module map, a __has_include would fail because the module map would block textual inclusion and [no_undeclared_includes] would block modular inclusion.

Here's some text that I'd suggest:

// We usually don't need to list the module map files of our dependencies when
// building a module explicitly: their semantics will be deserialized from PCM
// files.
//
// However, some module maps loaded implicitly during the dependency scan can
// describe anti-dependencies. That happens when this module, let's call it M1,
// is marked as '[no_undeclared_includes]' and tries to access a header "M2/M2.h"
// from another module, M2, but doesn't have a 'use M2;' declaration. The explicit
// build needs the module map for M2 so that it knows that textually including
// "M2/M2.h" is not allowed. E.g., '__has_include("M2/M2.h")' should return false,
// but without M2's module map the explicit build would return true.
//
// An alternative approach would be to tell the explicit build what its textual
// dependencies are, instead of having it re-discover its anti-dependencies. For
// example, we could create and use an `-ivfs-overlay` with `fall-through: false`
// that explicitly listed the dependencies. However, that's more complicated to
// implement and harder to reason about.

Feel free to reword (and/or drop the __has_include example and/or drop the last paragraph), but something like this would have prevented my confusion!

This revision is now accepted and ready to land.Mar 11 2022, 12:20 PM
This revision was landed with ongoing or failed builds.Mar 12 2022, 2:32 AM
This revision was automatically updated to reflect the committed changes.