This is an archive of the discontinued LLVM Phabricator instance.

[clang][modules] Deserialize included files lazily
ClosedPublic

Authored by jansvoboda11 on Jul 12 2023, 3:33 PM.

Details

Summary

In D114095, HeaderFileInfo::NumIncludes was moved into Preprocessor. This still makes sense, because we want to track this on the granularity of submodules (D112915, D114173), but the way this information is serialized is not ideal. In ASTWriter, the set of included files gets deserialized eagerly, issuing lots of calls to FileManager::getFile() for input files the PCM consumer might not be interested in.

This patch makes the information part of the header file info table, taking advantage of its lazy deserialization which typically happens when a file is about to be included.

Diff Detail

Event Timeline

jansvoboda11 created this revision.Jul 12 2023, 3:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2023, 3:33 PM
Herald added a subscriber: ributzka. · View Herald Transcript
jansvoboda11 requested review of this revision.Jul 12 2023, 3:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2023, 3:33 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Now that it's not eagerly deserialized, should Preprocessor::alreadyIncluded call HeaderInfo.getFileInfo(File) to ensure the information is up to date? Similarly, we expose the list of files in Preprocessor::getIncludedFiles -- is it okay if this list is incomplete?

clang/lib/Serialization/ASTReader.cpp
1947

Reader.getPreprocessor().markIncluded?

clang/lib/Serialization/ASTWriter.cpp
2545

Can we remove ASTWriter::writeIncludedFiles now?

Now that it's not eagerly deserialized, should Preprocessor::alreadyIncluded call HeaderInfo.getFileInfo(File) to ensure the information is up to date?

It should, but Preprocessor::alreadyIncluded() is only called from HeaderSearch::ShouldEnterIncludeFile() and Preprocessor::HandleHeaderIncludeOrImport(), where HeaderSearch::getFileInfo(File) has already been called. But I agree it would be better to ensure that within Preprocessor::alreadyIncluded() itself. I'll try to include that in the next revision.

Similarly, we expose the list of files in Preprocessor::getIncludedFiles -- is it okay if this list is incomplete?

That should be okay. This function only needs to return files included in the current module compilation, not all transitive includes.

clang/lib/Serialization/ASTReader.cpp
1947

That would trigger infinite recursion, since that calls getFileInfo() which attempts to deserialize.

clang/lib/Serialization/ASTWriter.cpp
2545

Yes, forgot about that, thanks.

Remove dead code, make sure getFileInfo() is called in alreadyIncluded().

benlangmuir accepted this revision.Jul 13 2023, 1:55 PM
This revision is now accepted and ready to land.Jul 13 2023, 1:55 PM
This revision was landed with ongoing or failed builds.Jul 13 2023, 3:00 PM
This revision was automatically updated to reflect the committed changes.