This is an archive of the discontinued LLVM Phabricator instance.

[Modules] Add ability to specify module name to module file mapping in a file
Needs ReviewPublic

Authored by boris on Aug 30 2017, 8:46 AM.

Details

Summary

Add the -fmodule-file-map=[<prefix>=]<file> option which can be used to specify a file that contains module name to precompiled modules files mapping, similar to -fmodule-file=<name>=<file>. The <prefix> can be used to only consider certain lines which can be useful if we want to store the mapping in an already existing file (for example, as comments in the .d makefile fragment generated with the -M option or some such).

Additional notes:

  1. Based on this mailing list discussion: http://lists.llvm.org/pipermail/cfe-dev/2017-June/054431.html
  1. Based on the functionality implemented in: https://reviews.llvm.org/D35020

Diff Detail

Event Timeline

boris created this revision.Aug 30 2017, 8:46 AM

Couple of drive by comments - no doubt Richard will have a more in depth review, but figured this might save a round or two.

lib/Frontend/CompilerInvocation.cpp
1564–1568

Run the whole change through clang-format (there are tools to help do this over a patch or git revision range) so it matches LLVM coding conventions

1579

Prefer copy init over direct init:

for (size_t B = 0, E = 0;

& probably != rather than < is more canonical/common in the LLVM codebase.

1589

There's probably a StringRef-y way to do this rather than going back through raw pointers? (StringRef::substr or the like)

boris marked 3 inline comments as done.Sep 4 2017, 10:54 AM

David, thanks for the review. Uploading the new revision (also rebased on HEAD).

lib/Frontend/CompilerInvocation.cpp
1579

All done except the != change -- B can actually go one over size (see the npos case in the loop body).

boris updated this revision to Diff 113779.Sep 4 2017, 10:56 AM
boris marked an inline comment as done.

New patch revision with David's comments addressed.

bruno added a subscriber: bruno.Sep 22 2017, 11:50 AM

Hi Boris,

This is a handy option, very nice. Can you also add testcases?

lib/Frontend/CompilerInvocation.cpp
1564

auto Buf -> auto *Buf

rsmith edited edge metadata.Sep 22 2017, 3:05 PM

I wonder whether this feature really pulls its weight compared to using @file and prepending -fmodule-file= to each line.

The big difference between this patch and an @file seems to be the filtering by prefix / storing this information as a subset of the data in a file. Can you say a bit more about how you're envisioning this integrating into build systems, such that that would be a natural way to model this mapping?

boris added a comment.Sep 22 2017, 4:34 PM

Yes, the main "feature" of this approach compared to @file is the ability to reuse an already existing file to store this information. Most build systems that support C/C++ compilation have to store auxiliary dependency information at least for the extracted header dependencies (those .d files generated by the -M option family, for example) but some also store hashes of options, compiler version/signature, etc. So instead of creating a yet another file (per translation unit), the idea is to reuse the already existing one by storing the mapping with some "distinguishing" prefix. As a concrete example, a make-based build system could append it to the .d file (which is a makefile fragment) as comments. How exactly this information is extracted is still an open question but I think this approach is generic enough to accommodate a wide range of possibilities (for example, -M could produce this information or the build system could append it itself after the -M is done).

boris updated this revision to Diff 116438.Sep 22 2017, 4:42 PM
boris marked an inline comment as done.

New revision this time with the tests (which got dropped from the earlier revision diff for some reason).

boris updated this revision to Diff 116442.Sep 22 2017, 4:54 PM

Another attempt to upload a clean diff (also rebased on latest HEAD).

rsmith added a comment.Nov 3 2017, 3:19 PM

I'm still unconvinced that this mechanism is worth supporting.

The use case of putting this information into .d files doesn't make sense to me. .d files are generally outputs of prior compilations, whereas this information is a compilation input, needed in every compilation (including the first), and must be eagerly updated if a change to the source code adds a new dependency.

boris added a comment.Nov 4 2017, 12:04 AM

Hi Richard,

Thanks for still entertaining the idea. Yes, I believe, in order to support modules (TS) the build system will have to extract module (and header while at it) dependency information prior to compilation rather than as a byproduct of compilation (which is how most build systems do it now). I've written on how all this can work (as well as the support that would be required from the compiler) here:

https://build2.org/article/cxx-modules-misconceptions.xhtml#build

FWIW, we do it this way in build2 (and already have a .d file that we would like to reuse) and get 3x speedup using Clang on a modularized build. Also Gaby told me that he is working on a tool for VC that will do extraction of header and module dependency information (as a single preprocessor pass).

boris added a comment.Nov 17 2017, 4:17 AM

Ping.

Also, to add to my previous comment, even if for a moment we ignore the header dependencies and when they are extracted, a modern build system normally tracks other kinds of what can be called "auxiliary dependency information": some form of compiler signature, hash of options that were used to compile the file, etc., so that when any of those change, the object file gets recompiled automatically. For example, in build2, we store all these signatures/hashes plus the header and module dependency information in a single .d file (which we call auxiliary dependency database). What I am trying to show by this is that it is well established that for C/C++ compilation there has to be an extra file for each .o where such information is stored. And it seems natural to want to reuse this file for supplying the "module map" to the compiler instead of creating yet another per-.o file.