This is an archive of the discontinued LLVM Phabricator instance.

[clang][modules] Account for non-affecting inputs in `ASTWriter`
ClosedPublic

Authored by jansvoboda11 on Oct 24 2022, 10:31 AM.

Details

Summary

In D106876, we stopped serializing module map files that didn't affect compilation of the current module.

However, since each SourceLocation is simply an offset into SourceManager's global buffer of concatenated input files, these offsets need to be adjusted during serialization. Otherwise, they can incorrectly point past the end of the buffer or into a subsequent input file.

This patch starts adjusting SourceLocations, FileIDs and other SourceManager offsets in ASTWriter.
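
Conceptually, the adjustment subtracts from each offset the total size of the non-affecting inputs that precede it. A minimal sketch of that idea, assuming hypothetical names (not the code from this patch; the actual writer uses a more compact representation discussed in the comments below):

#include <cstdint>
#include <vector>

// Each omitted, non-affecting input used to occupy [Begin, End) in the
// concatenated offset space; any offset past it must shrink accordingly.
struct OmittedInput {
  uint64_t Begin, End;
};

uint64_t getAdjustedOffset(uint64_t Offset,
                           const std::vector<OmittedInput> &Omitted) {
  uint64_t Adjustment = 0;
  for (const OmittedInput &I : Omitted)
    if (I.End <= Offset) // the omitted input lies entirely before this offset
      Adjustment += I.End - I.Begin;
  return Offset - Adjustment;
}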

Diff Detail

Event Timeline

jansvoboda11 created this revision. Oct 24 2022, 10:31 AM
Herald added a project: Restricted Project. Oct 24 2022, 10:31 AM
jansvoboda11 requested review of this revision. Oct 24 2022, 10:31 AM
Herald added a subscriber: cfe-commits.

This looks reasonable. Have you measured the performance impact of this change?

I have done a comparison between this patch and https://github.com/apple/llvm-project/pull/5451 (which, instead of leaving non-affecting input files out, serializes one extra bit saying whether each input is affecting or not). Both versions carried some additional patches that refine the computation of non-affecting module maps, which I plan to upstream shortly. I used Clang's -ftime-trace on a reasonably large project (1680 TUs, 638 implicitly-built modules). The results say that serialization got 7.09% slower and deserialization 1.01% slower. The serialization slowdown I understand, but I expected deserialization to get faster, since we now have less of the SourceManager to look through. Overall, PCM compilation got 4.35% slower.

I tried optimizing this patch a bit. Instead of creating a compact data structure and using binary search to find the preceding non-affecting file, I now store the adjustment information for each FileID in a vector. During deserialization, the FileID is simply used as an index into SLocEntryInfos. That didn't yield any measurable improvement in performance, though. I think the regression must be coming from the SourceLocation/Offset to FileID translation.

I don't see any obvious way to work around that. SourceManager::getFileIDLocal() already implements some optimizations that make accessing nearby offsets fast. A separate SourceManager would avoid this bottleneck, but I'm not sure how much work that would entail (seems substantial).
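
As a rough model of this per-FileID variant (hypothetical names, not the actual code): the writer precomputes one adjustment per input file, but every query still has to translate a raw offset to its file index first, which is the step suspected of causing the regression.

#include <cstdint>
#include <vector>

struct FileEntryInfo {
  uint64_t StartOffset; // where this file's offsets begin
  uint64_t Adjustment;  // bytes of preceding non-affecting input to subtract
};

// Stand-in for the SourceLocation/Offset -> FileID translation; conceptually
// what SourceManager::getFileIDLocal() does with a binary search.
// Assumes Files is non-empty, sorted by StartOffset, and covers Offset.
size_t getFileIndex(uint64_t Offset, const std::vector<FileEntryInfo> &Files) {
  size_t Lo = 0, Hi = Files.size();
  while (Hi - Lo > 1) {
    size_t Mid = Lo + (Hi - Lo) / 2;
    if (Files[Mid].StartOffset <= Offset)
      Lo = Mid;
    else
      Hi = Mid;
  }
  return Lo;
}

uint64_t getAdjustedOffset(uint64_t Offset,
                           const std::vector<FileEntryInfo> &Files) {
  return Offset - Files[getFileIndex(Offset, Files)].Adjustment;
}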

@Bigcheese LMK if you're fine with the performance implications here.

New version with flat vector + FileID indices, replacing the previous compact representation & binary search approach.

I don't think you need to call getFileID() inside getAdjustment(). There is also an opportunity for a peephole, if getAdjustment() remains expensive.

I've left some comments on the previous version of the patch since it's not obvious to me how to avoid the getFileID() call in the new version.

The serialization slowdown I understand, but I expected deserialization to get faster, since we now have less of SourceManager to look through.

Seems worth digging into the deserialization regression. Does the PCM actually get smaller and the ranges more condensed?

One quick test would be to manufacture a situation where two output PCMs would previously have different non-affecting inputs, but now should be bit-for-bit identical. Are they, in fact, bit-for-bit identical? If not, maybe there's something funny to look into...

clang/include/clang/Serialization/ASTWriter.h
449–452

Can you collect a histogram for how big these vectors are? Can we avoid pointer chasing in the common case by making them SmallVector of some size during lookup?

clang/lib/Serialization/ASTWriter.cpp
2049–2050

Can we shift this getAdjustedOffset() computation to after deciding whether to skip the record?

5293–5294

How often does getAdjustment() return the same answer in consecutive calls? If at all common, this would likely benefit from a peephole:

// New ASTWriter members (sketch):
Optional<SLocRange> CachedAdjustmentRange;
Optional<SourceLocation::UIntTy> CachedAdjustment;

SourceLocation::UIntTy
ASTWriter::getAdjustment(SourceLocation::UIntTy Offset) const {
  // Loaded offsets and the no-non-affecting-inputs case need no adjustment.
  //
  // How fast is isLoadedOffset()? Can/should we add a peephole, or is it just bit
  // manipulation? (I seem to remember it checking the high bit or something, but if
  // it's doing some sort of lookup, maybe it should be in the slow path so it can
  // get cached by CachedAdjustment.)
  if (PP->getSourceManager().isLoadedOffset(Offset) ||
      NonAffectingInputs.empty())
    return 0;

  // Check CachedAdjustment.
  if (CachedAdjustment && CachedAdjustmentRange->includes(Offset))
    return *CachedAdjustment;

  // Call getAdjustmentSlow(), which updates CachedAdjustment and
  // CachedAdjustmentRange. It's out-of-line so that getAdjustment() can easily
  // be inlined without inlining the slow path.
  //
  // CachedAdjustmentRange would cover the "gap" between this adjustment level
  // and the next one (its end would be UINTMAX if the offset is past the last
  // non-affecting range).
  return getAdjustmentSlow(Offset);
}
5300–5301

Why do you need to call getFileID() here?

Instead, I would expect this to be a search through a range of offsets (e.g., see my suggestion at https://reviews.llvm.org/D106876#3869247 -- DroppedMMs contains SourceLocations, not FileIDs).

Two benefits:

  1. You don't need to call getFileID() to look up an offset.
  2. You can merge adjacent non-affecting files (shrinking the search/storage significantly).
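
A rough sketch of what such an offset-range search could look like, with adjacent non-affecting inputs already merged and each entry carrying a cumulative size (hypothetical names, not the code from this patch):

#include <algorithm>
#include <cstdint>
#include <vector>

struct MergedRange {
  uint64_t End;            // end offset of this merged non-affecting range
  uint64_t CumulativeSize; // omitted bytes in this range and all earlier ones
};

// Assumes Ranges is sorted by End and Offset does not fall inside an omitted
// range (locations in non-affecting files shouldn't be written at all).
uint64_t getAdjustment(uint64_t Offset,
                       const std::vector<MergedRange> &Ranges) {
  // First range whose end lies beyond Offset; every range before it is
  // omitted space that precedes the offset.
  auto It = std::upper_bound(
      Ranges.begin(), Ranges.end(), Offset,
      [](uint64_t O, const MergedRange &R) { return O < R.End; });
  return It == Ranges.begin() ? 0 : std::prev(It)->CumulativeSize;
}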
clang/test/Modules/non-affecting-module-maps-source-locations.m
32 ↗(On Diff #470211)

This is exercising the code, but it could do one better and check if the output PCMs are bit-for-bit identical when we (now) expect them to be.

Maybe you could do this by having two run lines: one that includes -I %t/second and another that doesn't. Then check if the output PCMs are equal.

dexonsmith added inline comments. Oct 27 2022, 8:07 AM
clang/test/Modules/non-affecting-module-maps-source-locations.m
32 ↗(On Diff #470211)

(Or if the PCM isn't bit-for-bit identical yet, maybe at least the AST block should be...)

Avoid looking up FileID for an offset.

I tried implementing your suggestion (merging ranges of adjacent non-affecting files and avoiding FileID lookups), but the numbers from -ftime-trace are very noisy. I got more stable data by measuring clock cycles and instruction counts, but nothing conclusive yet.

Compilation of CompilerInvocation.cpp with implicit modules.

  • previous approach with vector + FileID lookup: +0.64% cycles and +1.68% instructions,
  • current approach with merged SourceRanges: +0.38% cycles and +1.11% instructions.

I'll post here as I experiment more and get more data.

clang/lib/Serialization/ASTWriter.cpp
5300–5301

My reasoning was that if we search through a range of offsets, we're doing conceptually the same thing as getFileID() (which already has some optimizations baked in). Maybe the non-affecting files are indeed adjacent and we'll be able to merge most of them. I'll give it a shot and report back.

Nice; that seems like a bit of an improvement.

I'm curious; are system modules allowed to be non-affecting yet, or are they still assumed to be affecting? (It's the system modules that I think are most likely to be adjacent.)

My intuition is that there is likely some peephole that would be quite effective, that might not be useful for general getFileID() lookups.

  • I already suggested "same as last lookup?"... I'm curious if that'll help. Maybe that's already in getFileID(), but now that you've factored out that call, it could be useful to replicate.
  • You could also try: "past the the last non-affecting module?"
  • You could also try: "before the first non-affecting module?"

I suspect you could collect some data to guide this, such as, for loaded locations (you could ignore "local" locations since they already have a peephole):

  • Histogram of "loaded" vs. "between" vs. "after" non-affecting modules.
  • Histogram of "same as last" vs. "same as last-1" vs. "different from last 2".
  • [...]

Other things that might be useful to know:

  • What effect is the merging having (or would it have)? (i.e., what's the histogram of "adjacent" non-affecting files? (e.g.: 9 ranges of non-affecting files, with two blocks of 5 files and seven blocks of 1 (which aren't adjacent to any others)))
  • Is there a change in cycles/instructions when the module cache is hot? (presumably the common case)
  • Are the PCM artifacts smaller?
  • Are the PCMs bit-for-bit identical now when a non-affecting module is added to the input? (If not, why not?)
  • What's the data for implicitly-discovered, explicitly-built modules?

I'm curious; are system modules allowed to be non-affecting yet, or are they still assumed to be affecting? (It's the system modules that I think are most likely to be adjacent.)

Yes, all the measurements are done with system modules allowed to be non-affecting.

My intuition is that there is likely some peephole that would be quite effective, that might not be useful for general getFileID() lookups.

  • I already suggested "same as last lookup?"... I'm curious if that'll help. Maybe that's already in getFileID(), but now that you've factored out that call, it could be useful to replicate.

I tried that and we seemingly never hit that case. I think that's because these two optimizations take precedence:

  • You could also try: "past the the last non-affecting module?"
  • You could also try: "before the first non-affecting module?"

These avoid the binary search for the vast majority of calls to getAdjustment() with local offsets.

I suspect you could collect some data to guide this, such as, for loaded locations (you could ignore "local" locations since they already have a peephole):

  • Histogram of "loaded" vs. "between" vs. "after" non-affecting modules.
  • Histogram of "same as last" vs. "same as last-1" vs. "different from last 2".
  • [...]

Loaded locations make up 68% of all calls to getAdjustment(), so it's good it's the first thing we check for. Around 4% of all calls are for locations before the first non-affecting module. That's the second special case. Around 28% are pointing after the last non-affecting module, and the current revision has a special case for that as well. I think it would make sense to prioritize this check. Only the remaining 0.3% of calls do the actual binary search.
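
A sketch of the check order this suggests, reusing the style of the earlier peephole sketch (NonAffectingRanges, TotalNonAffectingSize, and getAdjustmentSlow() are assumed names, not the actual members):

SourceLocation::UIntTy
ASTWriter::getAdjustment(SourceLocation::UIntTy Offset) const {
  // Loaded offsets live above all local offsets (they grow down from 2^31),
  // so this check must come before the past-the-last-range check.
  if (NonAffectingRanges.empty() ||
      PP->getSourceManager().isLoadedOffset(Offset))
    return 0;                                    // ~68% of calls
  if (Offset >= NonAffectingRanges.back().End)   // ~28%: past the last range
    return TotalNonAffectingSize;
  if (Offset < NonAffectingRanges.front().Begin) // ~4%: before the first range
    return 0;
  return getAdjustmentSlow(Offset);              // ~0.3%: binary search
}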

Other things that might be useful to know:

  • What effect is the merging having (or would it have)? (i.e., what's the histogram of "adjacent" non-affecting files? (e.g.: 9 ranges of non-affecting files, with two blocks of 5 files and seven blocks of 1 (which aren't adjacent to any others)))

Big one. We usually get between 70-100 non-affecting module maps, but they merge into 4-6 consecutive regions.
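
As a toy illustration of that merging (hypothetical names): inputs that sit back-to-back in the offset space collapse into a single range, which is how 70-100 non-affecting module maps can shrink to a handful of search entries.

#include <cstdint>
#include <utility>
#include <vector>

using OffsetRange = std::pair<uint64_t, uint64_t>; // [begin, end)

// Assumes the input ranges are sorted by begin offset.
std::vector<OffsetRange> mergeAdjacent(const std::vector<OffsetRange> &Sorted) {
  std::vector<OffsetRange> Merged;
  for (const OffsetRange &R : Sorted) {
    if (!Merged.empty() && Merged.back().second == R.first)
      Merged.back().second = R.second; // extend the previous range
    else
      Merged.push_back(R);
  }
  return Merged;
}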

  • Is there a change in cycles/instructions when the module cache is hot? (presumably the common case)

I didn't notice this (but didn't look for it specifically). How could that affect performance for PCM writes?

  • Are the PCM artifacts smaller?

Yes, we leave out the 70-100 SLocEntries. Nothing much changes otherwise.

  • Are the PCMs bit-for-bit identical now when a non-affecting module is added to the input? (If not, why not?)

They are not in the current revision, but I'll create a patch that makes them so (we also need to adjust the NumCreatedFileIDs).

  • What's the data for implicitly-discovered, explicitly-built modules?

I didn't measure that. I don't expect much of a difference, though. The scanner has an empty AST, so the number of SourceLocations will be pretty low. (Other parts of the PCM file, like the metadata, don't write many SourceManager offsets.) Even if there were a small regression, I'd be fine with it, since we'll be moving off of implicit builds in the scanner. For the explicit builds themselves, this shouldn't affect performance at all: implicit module map search is disabled, and all necessary/affecting module maps are deserialized from the PCM files or provided on the command line.

jansvoboda11 marked 5 inline comments as done. Nov 1 2022, 1:17 PM
jansvoboda11 added inline comments.
clang/include/clang/Serialization/ASTWriter.h
449–452

Usually 4-6 elements. Making them a SmallVector<T, 8> didn't affect performance, though.

clang/lib/Serialization/ASTWriter.cpp
5293–5294

Not that often, see my top-level comment.

5300–5301

This ended up being faster due to merging of non-affecting files. Thanks for the suggestion!

clang/test/Modules/non-affecting-module-maps-source-locations.m
32 ↗(On Diff #470211)

Yes, I'll probably drop this test entirely and just check the PCM files are bit-for-bit identical when a non-affecting file is not loaded at all.

  • Is there a change in cycles/instructions when the module cache is hot? (presumably the common case)

I didn't notice this (but didn't look for it specifically). How could that affect performance for PCM writes?

It wouldn't check write perf, but it's part of the overall build perf impact. This being neutral (no regression) could help to justify landing the change even when there's a penalty for a cold module cache. My interest in (implicitly-discovered) explicitly-built modules was similar (if that's neutral, then a regression here is less critical).

Partly, trying to dig into why read speeds got slower. But maybe that was noise that went away when you switched to cycles/instructions?

Loaded locations make up 68% of all calls to getAdjustment(), so it's good it's the first thing we check for. Around 4% of all calls are for locations before the first non-affecting module. That's the second special case. Around 28% are pointing after the last non-affecting module, and the current revision has a special case for that as well. I think it would make sense to prioritize this check. Only the remaining 0.3% of calls do the actual binary search.

Great; looking forward to seeing new numbers.

(BTW, if you check before/after last non-affecting module, does one of those subsume the is-loaded check entirely? Looking at isLoadedOffset() makes me think it might. If so, maybe you can replace that check with an assertion.)

  • Are the PCMs bit-for-bit identical now when a non-affecting module is added to the input? (If not, why not?)

They are not in the current revision, but I'll create a patch that makes them so (we also need to adjust the NumCreatedFileIDs).

Great. Seems like another motivation to land this change (despite a regression), as it can impact meta-build perf if/when artifacts are being archived/shared in a larger context. Specifically, if the PCM artifacts are more stable, then:

  • They take up less aggregate space in CAS storage.
  • Later build steps get more cache hits.

Overall this patch seems like a good step to me; I don't think 1-2% for a single compilation on a cold cache is that bad, assuming a hot cache doesn't regress:

  • Across a single build, the cache gets hot pretty quickly as most compilations reuse the same modules.
  • Across incremental builds, most of the cache (at least system modules) will (usually) be hot.
  • For explicit builds, it sounds like there's no regression anyway.
clang/include/clang/Basic/SourceManager.h
1831–1833 ↗(On Diff #471641)

The logic for isLoadedOffset() suggests that it could maybe be subsumed by the "location past the end" check?

clang/test/Modules/non-affecting-module-maps-source-locations.m
32 ↗(On Diff #470211)

That sounds great.

jansvoboda11 marked 4 inline comments as done. Nov 1 2022, 4:35 PM

Partly, trying to dig into why read speeds got slower. But maybe that was noise that went away when you switched to cycles/instructions?
Great; looking forward to seeing new numbers.

Ah, I forgot to mention this. Building the modules is now only 0.2% slower and importing them 1.2% faster (compared to PCMs with all input files serialized).

clang/include/clang/Basic/SourceManager.h
1831–1833 ↗(On Diff #471641)

I don't think so - we don't want to adjust loaded offsets. Their invariant is that they grow from 2^31 downwards.

We do want to adjust local offsets past the last non-affecting file though.

Ah, I forgot to mention this. Building the modules is now only 0.2% slower and importing them 1.2% faster (compared to PCMs with all input files serialized).

Awesome. All upside then :).

I added a few nitpick-y comments inline. I can take another look once you've made the PCMs bit-for-bit identical and updated the test (is that happening here, or in a separate review)?

clang/include/clang/Basic/SourceManager.h
1831–1833 ↗(On Diff #471641)

Oops, right!

clang/lib/Serialization/ASTWriter.cpp
4535

Not sure this comment adds much on top of the code on the next line. A sentence before the for loop describing the overall approach might be useful though.

4547–4551

You can reduce nesting by inverting this condition and using continue.

4559–4560

I'd slightly prefer the comment *before* the if, due to how folding tends to work in editors (you can see the comment even when the code is folded). This probably relies on dropping the else (see below).

4565

I suggest a continue before the else to avoid adding nesting for the insertion.
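
A generic before/after illustration of the continue-based nesting reduction suggested in these comments (toy code, not the actual ASTWriter loop):

#include <vector>

struct InputFile {
  bool Affecting;
};

// Nested form: the long body sits inside an if/else.
void writeNested(const std::vector<InputFile> &Files) {
  for (const InputFile &F : Files) {
    if (F.Affecting) {
      // ... long body writing the record ...
    } else {
      // ... record the non-affecting file ...
    }
  }
}

// Inverted condition + continue: the non-affecting case is handled up front
// and the long body loses one level of nesting.
void writeFlat(const std::vector<InputFile> &Files) {
  for (const InputFile &F : Files) {
    if (!F.Affecting) {
      // ... record the non-affecting file ...
      continue;
    }
    // ... long body writing the record ...
  }
}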

jansvoboda11 marked an inline comment as done.

Rebase, decrease nesting, test using diff

jansvoboda11 marked 4 inline comments as done. Nov 1 2022, 5:29 PM
dexonsmith accepted this revision. Nov 1 2022, 5:37 PM

LGTM, with one suggestion for the test inline.

clang/test/Modules/add-remove-irrelevant-module-map.m
30

Maybe the CC1s should add -verify and test-simple.m should have // expected-no-diagnostics to help protect against bitrot?

This revision is now accepted and ready to land. Nov 1 2022, 5:37 PM
This revision was landed with ongoing or failed builds. Nov 1 2022, 7:33 PM
This revision was automatically updated to reflect the committed changes.