Long story short: I think that it is too dangerous to rely on the global module index for anything but diagnostics.
The global module index is only rebuilt in two cases:
a. When triggered by Sema for diagnosing errors
b. At the very end of FrontendAction::Execute()
This patch deletes a shortcut in ModuleManager::visit() that uses the global module index as a negative cache to decide that a module does not need to be opened when looking for a specific identifier.
In the test case (see also the PR https://bugs.llvm.org/show_bug.cgi?id=32332) we get different PCM output for B.pcm when:
(1) building B.pcm (which also builds the dependency A.pcm in one go)
(2) building A.pcm, then building B.pcm in a separate clang invocation
Because of when the global module index is built, (1) does not use the above shortcut, but (2) does. The symbol where this makes a difference is the built-in macro FLT_EPSILON, which does not belong to any module but still makes it into the global module index, recorded as not being defined in any module. As a result, the symbol is serialized only into A.pcm in (1), but into both A.pcm and B.pcm in (2).
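To make the failure mode concrete, here is a minimal toy model of a negative-cache shortcut of this kind. It is not the Clang implementation: ToyModule, ToyIndex and modulesVisited are hypothetical names invented for this sketch, and the index is reduced to a plain map. It only illustrates that the set of modules consulted for an identifier depends on whether an index happens to exist at lookup time.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a PCM file: the identifiers it has entries for.
struct ToyModule {
  std::string Name;
  std::set<std::string> Identifiers;
};

// Hypothetical stand-in for the global module index:
// identifier -> names of the modules recorded as containing it.
// An identifier mapped to an empty set means "known to be in no module".
using ToyIndex = std::map<std::string, std::set<std::string>>;

// Returns the names of the modules that would be opened while looking up Id.
// Index may be null, which models a build where no index exists yet.
std::vector<std::string> modulesVisited(const std::vector<ToyModule> &Mods,
                                        const ToyIndex *Index,
                                        const std::string &Id) {
  std::vector<std::string> Visited;
  for (const ToyModule &M : Mods) {
    if (Index) {
      auto It = Index->find(Id);
      // Negative cache: the index knows Id and does not list M, so skip M.
      if (It != Index->end() && It->second.count(M.Name) == 0)
        continue;
    }
    Visited.push_back(M.Name);
  }
  return Visited;
}
```

In this sketch, with a module A that has an entry for FLT_EPSILON and an index that records FLT_EPSILON as belonging to no module, a lookup without the index opens A, while the same lookup with the index skips it. That is the kind of divergence between (1) and (2) described above.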
I have considered the alternative of hunting down why builtins are serialized into the PCM in the first place, but frankly I feel like this is just the tip of the iceberg, and that relying on the global module index this way is just asking for trouble in concurrent or incremental builds. I have looked at the performance impact of the change, and in the project I built (a large project, ~30min build time) the wall-clock numbers were in the noise.