This is an archive of the discontinued LLVM Phabricator instance.

[Modules] Move search paths from control block to unhashed control block
AbandonedPublic

Authored by bruno on Aug 30 2019, 10:34 AM.

Details

Summary

Header search paths are currently ignored for the purpose of computing
getModuleHash() during a module build. This means that it doesn't
matter how much one changes header search paths in different clang
invocations, it will always use the same PCM.

This is not really correct but has been mostly "working" ever since.
However, when clang started signing the CONTROL_BLOCK section of the
PCM, which is where the header search paths live in the PCMs, we started
to get unexpected rebuilds when using implicit modules and `signature
mismatch` errors when those are rebuilt in the context of PCHs.

It's inconsistent that we ignore header search paths when selecting a
PCM to build/reuse but consider it as part of the signature of the PCM.
This patch proposes to move serialization of header search paths to
the UNHASHED_CONTROL_BLOCK instead, as we currently do for diagnostics.

Disconsidering the header search paths as part of the signature
shouldn't make it worse for correctness because if the header search
path would change anything, that would change the INPUT_FILE, which
*still is* part of the CONTROL_BLOCK, and therefore will trigger a
module rebuild.

Follow ups from this that might be interesting:

  • Introduce -fmodules-strict-header-search, which does consider search paths in getModuleHash(). This would provide users a way out when they hit bugs related to header search paths, but with the downside of bigger module caches (in case those paths change often in their builds).
  • Optimize the serialization of such paths to only consider the ones that actually contribute to the INPUT_LIST. This has the nice effect that a module only knows about the paths that are actually relevant to it, increase reusability.
  • Add warnings or remarks when the header search paths in the PCM don't match the ones in the compiler.

There are no testcases for this change, as it would require using
llvm-bcanalyzer in clang tests, which we don't currently do.

rdar://problem/45567811

Event Timeline

bruno created this revision.Aug 30 2019, 10:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 30 2019, 10:34 AM
rsmith added a comment.Oct 1 2019, 5:49 PM

This seems reasonable to me, but a test case would be nice.

A mode to validate that header path changes don't change anything would be nice (essentially: redo all the header searches with the new set of paths and new filesystem and make sure they find the same thing), but doesn't seem urgent. (Note that that would catch both the case where the changed header search paths affect behavior and also the case where new files have appeared on an existing header search path and should be found by the header lookup instead of the cached ones.)

bruno abandoned this revision.Nov 5 2020, 7:07 PM