This is an archive of the discontinued LLVM Phabricator instance.

Support lazy stat'ing of files referenced by module maps
ClosedPublic

Authored by rsmith on May 30 2017, 5:00 PM.

Details

Summary

This patch adds support for a header declaration in a module map to specify certain stat information (currently, size and mtime) about that header file. This has two purposes:

  • It removes the need to eagerly stat every file referenced by a module map. Instead, we track a list of unresolved header files with each size / mtime (actually, for simplicity, we track submodules with such headers), and when attempting to look up a header file based on a FileEntry, we check if there are any unresolved header directives with that FileEntry's size / mtime and perform deferred stats if so.
  • It permits a preprocessed module to be compiled without the original files being present on disk. The only reason we used to need those files was to get the stat information in order to do header -> module lookups when using the module. If we're provided with the stat information in the preprocessed module, we can avoid requiring the files to exist.

Unlike most header directives, if a header directive with stat information has no corresponding on-disk file the enclosing module is *not* marked unavailable (so that behavior is consistent regardless of whether we've resolved a header directive, and so that preprocessed modules don't get marked unavailable). We could actually do this for all header directives: the only reason we mark the module unavailable if headers are missing is to give a diagnostic slightly earlier (rather than waiting until we actually try to build the module / load and validate its .pcm file).

Diff Detail

Repository
rL LLVM

Event Timeline

rsmith created this revision.May 30 2017, 5:00 PM
bruno edited edge metadata.Jun 1 2017, 4:45 PM

Hi Richard,

Thanks for improving this further!

Questions / comments:

  • I've noticed in the patch that on the ASTWriter side we serialize the introduced size / mtime, but there are no changes to the ASTReader, so I assume in the reader side you still need the module map around to fetch those infos, right?
  • Does this patch provide any benefits for the case where we don't provide a size / mtime for the header in the module map? Since Size / ModTime are only updated when parsing the module map I would assume no, but I might have missed something.
include/clang/Basic/DiagnosticSerializationKinds.td
193 ↗(On Diff #100799)

Can you add a testcase that test this error?

rsmith added a comment.Jun 1 2017, 5:07 PM
  • I've noticed in the patch that on the ASTWriter side we serialize the introduced size / mtime, but there are no changes to the ASTReader, so I assume in the reader side you still need the module map around to fetch those infos, right?

On the reader side, we already support the case where the module map is absent, and already use the size / mtime as a key for HeaderFileInfo lookups, and when reading the SUBMODULE records we don't actually read the header file list at all, instead taking the relevant information from the HeaderFileInfo lookups. As a consequence, no changes to the reader are required.

  • Does this patch provide any benefits for the case where we don't provide a size / mtime for the header in the module map? Since Size / ModTime are only updated when parsing the module map I would assume no, but I might have missed something.

No, it shouldn't change anything in the case where neither size nor mtime are provided. In those cases, we still eagerly stat the header so we can try to match later #included files against it.

rsmith updated this revision to Diff 101157.Jun 1 2017, 6:24 PM
rsmith marked an inline comment as done.

Rebased and added requested test.

bruno accepted this revision.Jun 1 2017, 6:31 PM

LGTM!

This revision is now accepted and ready to land.Jun 1 2017, 6:31 PM
This revision was automatically updated to reflect the committed changes.