We add -fmodules-use-prebuilt-modules to support using prebuilt modules. In this mode, there is no need to load any module map and the programmer can simply use "@import" syntax to load the module directly from module cache without parsing any module map. We don't support rebuilding of the modules and ignore configuration mismatches.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
How about -fmodules-use-prebuilt-module-cache for the flag name? Saying "prebuilt-modules" is confusing to me, since -fmodule-file can also be used to load a prebuilt module, but doesn't use a cache.
lib/Driver/Tools.cpp | ||
---|---|---|
5416 ↗ | (On Diff #66677) | if (IsPrebuilt) { |
lib/Frontend/CompilerInstance.cpp | ||
1491 ↗ | (On Diff #66677) | Why do we assume this? |
Sorry for the delay, I apparently forgot to hit submit. Replied inline.
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1491 ↗ | (On Diff #66677) | I guess treating it as system is the more forgiving option, so I'm okay with keeping it. It's probably worth adding a comment that this is not always the correct choice, since it can affect diagnostics, etc. |
Oops, forgot to address your other comments.
Yes, I will change the flag name as you suggested and use if (IsPrebuilt).
Cheers,
Manman
This doesn't seem like quite the right user interface for this feature. The module cache is volatile, and it's not really reasonable for people to assume they know what is and isn't within it. Instead, it seems like what we want to provide is zero or more paths to directories containing prebuilt .pcm files whose file names are the same as their top-level module names, completely separate from the module cache.
Specifically, rather than spelling this as -fmodules-use-prebuilt-modules -fmodules-cache-path=X, I'd suggest spelling it as -fprebuilt-module-path=X or similar (which can be repeated to provide multiple search paths). That would then combine orthogonally with -fno-implicit-modules and -fno-implicit-module-maps, where implicit modules would still use the module cache. Does that make sense for your use case?
You're also missing updates to the modules documentation.
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1483 ↗ | (On Diff #67431) | This looks wrong: you're asking ReadAST to not diagnose a configuration mismatch, and you also don't diagnose it yourself. |
1502–1510 ↗ | (On Diff #67431) | You requested that ReadAST produces diagnostics for the OutOfDate and Missing cases when using prebuilt modules, so this diagnostic is redundant. |
lib/Serialization/ASTReader.cpp | ||
495–498 ↗ | (On Diff #67431) | This seems unnecessarily complicated. Perhaps you could make CompilerInstance::loadModule treat all prebuilt modules as MK_ExplicitModule, or add a third ModuleKind for them? |
What we want is to achieve performance by prebuilding the modules and using @import to load the corresponding pcm without parsing the module maps. In ASTReader, we want to treat these modules in the same way as the explicit modules.
-fprebuilt-module-path sounds reasonable. Let me see how the implementation goes ...
You're also missing updates to the modules documentation.
Yes, I will update the document.
Cheers,
Manman
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1502–1510 ↗ | (On Diff #67431) | Yes, I can remove this diagnostic. |
lib/Serialization/ASTReader.cpp | ||
495–498 ↗ | (On Diff #67431) | I tried adding a ModuleKind at some point, but somehow it didn't work out. I will take another look at that. |
include/clang/Lex/HeaderSearchOptions.h | ||
---|---|---|
96–97 ↗ | (On Diff #67621) | It would seem preferable to allow multiple of these, to support mixing (for instance) a path to prebuilt system modules and a path to some per-project modules. I'd be fine handling that as a separate patch / discussion if you prefer. |
lib/Frontend/CompilerInstance.cpp | ||
1438–1450 ↗ | (On Diff #67621) | It looks like the logic here is:
I think that may be surprising; if I explicitly load a module map, perhaps via -fmodule-map-file=, and supply a prebuilt form of that module in the prebuilt modules path, I'd expect my prebuilt form to still be used. Perhaps instead we could first search the prebuilt module path for an existing .pcm file, and if that fails and we have a Module then look for it in the module cache? |
1497–1505 ↗ | (On Diff #67621) | The module file will have provided a Module instance, if it is in fact a module file for the requested module. Instead of inventing a Module here, can you check that one exists and produce an error if not? |
lib/Serialization/ASTReader.cpp | ||
2273 ↗ | (On Diff #67621) | You're allowing configuration mismatches that we don't consider to be "compatible configuration mismatches" here. That seems really scary -- I believe this allows using a C++ module from C and similar unreasonable things, that will cause clang to crash. Did you really mean to do that? |
Thanks,
Manman
include/clang/Lex/HeaderSearchOptions.h | ||
---|---|---|
96–97 ↗ | (On Diff #67621) | It sounds reasonable to provide prebuilt system modules and prebuilt project modules. I assume via the same command-line option -fprebuilt-module-path. |
lib/Frontend/CompilerInstance.cpp | ||
1438–1450 ↗ | (On Diff #67621) | I was wondering about the priority of loading from prebuilt module path vs. loading from module cache. Your suggestion makes sense. |
1497–1505 ↗ | (On Diff #67621) | This is nice to know. I didn't realize we construct a Module when calling ReadAST. Can you be more specific on how to check if a Module exists? |
lib/Serialization/ASTReader.cpp | ||
2273 ↗ | (On Diff #67621) | I originally had this in my testing patch to be really relaxing on prebuilt modules. I will remove this :] |
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1438–1450 ↗ | (On Diff #67621) | +1 to Richard's suggestion here. I wonder about the workflow on generating such modules. Suppose I want to build some modules to re-use later with -fprebuilt-module-path=. RIght now, we get a module cache with hash in the names unless we disable hash computation via cc1 option -fdisable-module-hash. Should we assume that the user will have to copy out the cache dir and rename the modules OR that it will use -fdisable-module-hash instead? Should this option also try to read out modules with a hash in the name? Should we promote -fdisable-module-hash to a driver option as well? @Manman, do you have any workflow in mind? @Richard, do you have any feedback in this respect? In the crash reproducer we often have to rebuild the modules when running the script in a different machine because of the paths used to hash the modules, it would be nice (depending on the resulting design) if the crash reproducer could use -fprebuilt-module-path in order to try to re-use the pre-built modules resulting from the crash. |
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1438–1450 ↗ | (On Diff #67621) | If the modules are built implicitly with module hash, to load the module files as prebuilt modules, the user needs to copy out the cache dir and rename the modules. The user can choose to emit the modules explicitly or disable module hash when building the modules. |
1497–1505 ↗ | (On Diff #67621) | I assumed ReadAST only constructed Module instances for submodules, because it is inside ReadSubmoduleBlock. It turns out it also constructed Module instance for the top level module. I will update the patch. |
Just a couple of things, but this basically LGTM.
docs/Modules.rst | ||
---|---|---|
217 ↗ | (On Diff #67727) | Please document that the option can be given multiple times. |
lib/Frontend/CompilerInstance.cpp | ||
1502 ↗ | (On Diff #67727) | This should have a real diagnostic rather than an assertion. This would happen if the pcm file in the prebuilt module path had the wrong filename, for instance. It'd also be good to check that Module->ASTFile actually refers to the file that we found in the prebuilt module path, and not (say) to one of its transitive dependencies that ReadAST also loaded. |
Thanks,
Manman
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1502 ↗ | (On Diff #68442) | Updated the patch to use error diagnostics instead of assertion. <<< Did we already check this when creating the module in ASTReader? if (!ParentModule) { if (const FileEntry *CurFile = CurrentModule->getASTFile()) { if (CurFile != F.File) { ... } } CurrentModule->setASTFile(F.File); } |
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1502 ↗ | (On Diff #68442) | The AST reader has no idea which module we're trying to load a module file for, so it can't check this. The check you quoted is ensuring that we don't have two different module files providing the same module. A testcase for this would be: module A is in prebuilt-module-cache/x.pcm module B is in prebuilt-module-cache/A.pcm and depends on x.pcm When trying to load module A, ReadAST will load A.pcm, and we'll find that we now have a definition for module A, but we still didn't get the module from the expected file. |
Cheers,
Manman
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1502 ↗ | (On Diff #68456) | Thanks for the explanation, it makes sense. Updated the patch accordingly. |
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1503 ↗ | (On Diff #68456) | It'd be safer to check that FileMgr.getFile(ModuleFileName) == Module->getASTFile() in case the filename gets canonicalized by the file manager in some way. |
lib/Frontend/CompilerInstance.cpp | ||
---|---|---|
1503 ↗ | (On Diff #68456) | Yes you are right. I will update the patch and commit the change. Thanks for all the useful information! |