This is an archive of the discontinued LLVM Phabricator instance.

[clang][modules] Cache loads of modules imported by PCH
ClosedPublic

Authored by jansvoboda11 on Oct 11 2021, 10:15 AM.

Details

Summary

During explicit modular build, PCM files are typically specified via the -fmodule-file=<path> command-line option. Early during the compilation, Clang uses the ASTReader to read their contents and caches the result so that the module isn't loaded implicitly later on. A listener is attached to the ASTReader to collect names of the modules read from the PCM files. However, if the PCM has already been loaded previously via PCH:

  1. the ASTReader doesn't do anything for the second time,
  2. the listener is not invoked at all,
  3. the module load result is not cached,
  4. the compilation fails when attempting to load the module implicitly later on.

This patch solves this problem by attaching the listener to the ASTReader for PCH reading as well.

Diff Detail

Event Timeline

jansvoboda11 requested review of this revision.Oct 11 2021, 10:15 AM
jansvoboda11 created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptOct 11 2021, 10:15 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Note: Another approach to fixing this might be to cache the module load results while loading the PCH too.

Note: Another approach to fixing this might be to cache the module load results while loading the PCH too.

Can you share why you chose this approach instead, and which do you think makes sense long term?

clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
41

Why drop the <name>= argument? Isn't it better to keep that information, rather than forcing the reader to crack open the file to know what's inside?

Note: Another approach to fixing this might be to cache the module load results while loading the PCH too.

Can you share why you chose this approach instead, and which do you think makes sense long term?

I assumed there was a reason this is not being done for PCHs that I can't see.

If that idea seems workable to you, I can give it a try and see if it leads to cleaner code. Conceptually, I think it would make sense to treat both PCHs and PCMs the same in this regard.

It would be nice to have @rsmith's opinion, since I think he originally implemented this.

clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
41

I think it's nice to be consistent. We already pass PCM files in the -fmodule-file=<path> form for:

  • all dependencies when precompiling a module,
  • non-PCH dependencies dependencies when compiling a TU.

The -fmodule-file=<name>=<path> format is handled lazily (done via header search), which might be less efficient (see 0f99d6a4413e3da6ec242c0ab715d6699045dea8). We don't need laziness, since we know we'll need to open the file anyways.

I'm open to revising this decision later as part of performance tuning, but for now, I'd like this to be consistent across the board.

Bonus points: it forces me to fix a Clang bug.

WDYT?

Note: Another approach to fixing this might be to cache the module load results while loading the PCH too.

Can you share why you chose this approach instead, and which do you think makes sense long term?

I assumed there was a reason this is not being done for PCHs that I can't see.

If that idea seems workable to you, I can give it a try and see if it leads to cleaner code. Conceptually, I think it would make sense to treat both PCHs and PCMs the same in this regard.

I can't think of a reason not to do it but maybe I'm not being imaginative enough. I tend to suspect differences between PCH and PCM are incidental, due to flagging interest in PCH once PCMs started to work.

It would be nice to have @rsmith's opinion, since I think he originally implemented this.

clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
41

Okay, SGTM.

Implemented alternative approach: attach the caching listener to PCH reading as well.

jansvoboda11 retitled this revision from [clang][modules] Cache loads of explicit modules imported by PCH to [clang][modules] Cache loads of modules imported by PCH.Oct 12 2021, 3:17 AM
jansvoboda11 edited the summary of this revision. (Show Details)

I implemented the alternative approach, which now seems more straightforward than going through the ModuleManager and working with serialization::ModuleFile.
WDYT @dexonsmith?

dexonsmith accepted this revision.Oct 12 2021, 10:35 AM

LGTM. It'd be nice to split out the move of ReadModuleNames into an NFC prep patch for a cleaner commit history before pushing, but up to you.

This revision is now accepted and ready to land.Oct 12 2021, 10:35 AM

LGTM. It'd be nice to split out the move of ReadModuleNames into an NFC prep patch for a cleaner commit history before pushing, but up to you.

Committed in prep patch: aae776a5341ccf90a2c0a4e2e952ae4ec5ffb422.