This is an archive of the discontinued LLVM Phabricator instance.

[C++20] [Modules] Use '-' as the separator of partitions when searching in filesystems
ClosedPublic

Authored by ChuanqiXu on Mar 3 2022, 12:42 AM.

Details

Summary

It is simpler to search for module unit by -fprebuilt-module-path option. However, the separator ':' of partitions is not friendly. According to the discussion in https://reviews.llvm.org/D118586, I think we get consensus to use '-' as the separator instead. The '-' is the choice of GCC too.

Previously I thought it would be better to add an option. But I feel it is over-engineering now. Another reason here is that there are too many options for modules (for clang module mainly) now. Given it is not bad to use '-' when searching, I think it is acceptable to not add an option.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Mar 3 2022, 12:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 12:42 AM
ChuanqiXu requested review of this revision.Mar 3 2022, 12:42 AM
ChuanqiXu added inline comments.
clang/test/Modules/search-partitions.cppm
20 ↗(On Diff #412619)

@MyDeveloperDay hi, I remember the support for partitions in clang-format is done in https://reviews.llvm.org/D114151. I think I did something wrong. I format my code with the following command in the root directory of llvm project:

git diff -U0 --no-color --relative HEAD^ | clang/tools/clang-format/clang-format-diff.py -p1 -i

Would it use an out-dated clang-format? If yes, how should I use the proper one. Or if is it possible that the .clang-format files is not updated correctly?

iains added inline comments.Mar 3 2022, 4:57 AM
clang/test/Modules/search-partitions.cppm
20 ↗(On Diff #412619)

FWIW, I have been using git clang-format .... with llvm-14 clang-format in my PATH and it seems to work OK there - but not for llvm <= 13 (on macOS 10.15).

ChuanqiXu updated this revision to Diff 412899.Mar 3 2022, 6:39 PM

Format tests.

clang/test/Modules/search-partitions.cppm
20 ↗(On Diff #412619)

It works now. Thanks!

Stupid question: this works with Windows as well? And the BTW sounds odd. gcc also decided to use a dash as the separator.

tschuett added inline comments.Mar 3 2022, 6:58 PM
clang/lib/Lex/HeaderSearch.cpp
199

Or: here clang and gcc choose ...

ChuanqiXu updated this revision to Diff 412928.Mar 3 2022, 9:40 PM

Address comments.

ChuanqiXu marked an inline comment as done.Mar 3 2022, 9:41 PM

Stupid question: this works with Windows as well? And the BTW sounds odd. gcc also decided to use a dash as the separator.

It should work with Windows since Windows should allow '-' in filename. The test shows this, too.

clang/lib/Lex/HeaderSearch.cpp
199

Thanks. It is helpful for non-native speakers : )

urnathan resigned from this revision.Mar 9 2022, 10:43 AM
ChuanqiXu marked an inline comment as done.Mar 14 2022, 8:25 PM

@iains ping

iains added a comment.Mar 23 2022, 4:41 AM

I have no comment the patch itself, that seems fine, I even agree that this could be convenient :-)

However, I am concerned that this now means that '-fprebuilt-module-path' has meaning for both clang modules and c++20 modules - which is something we agree is generally undesirable (AFAIU, we agree it would be better to have orthogonal command line switch sets).

At some stage, we need to abstract the general mapping between module names and BMI names; I think the ownership of that mapping needs to be in the module loader (or, at least, available when the compiler instance has no pre-processor).
This is because when building a pre-processed source that imports a module we could still need to carry out the module-name to BMI name translation (the alternative seems bad, which would be to bake the BMI path in the pre-processed output).

So, I'm not 100% convinced about lookup schemes for BMI names that depend on header search being available.

maybe other modules folks have a view?

iains added a subscriber: Restricted Project.Mar 23 2022, 4:42 AM

@vsapsai @dblaikie I want to make sure if this one would affect clang modules. Or simply, would the name of clang modules contain :? (: is the separator of C++20 modules partitions.)

@vsapsai @dblaikie I want to make sure if this one would affect clang modules. Or simply, would the name of clang modules contain :? (: is the separator of C++20 modules partitions.)

Module names in .modulemap should be valid C identifiers, if I remember ModuleMapParser code correctly. You can have submodules that are separated by . (no such meaning for C++20 modules). So in .modulemap you cannot specify a module with :. I'm not sure how you can specify a module name when you do that from the command line, need to ask around about that.

Also I think some team internally has a custom machinery with -fprebuilt-module-path. I'm trying to find them, so they can make sure it works with their approach.

By the way, tried a module name with a colon and ModuleMapParser doesn't accept it

include/module.modulemap:1:12: error: skipping stray token
module Test:Colon {
           ^
include/module.modulemap:1:13: error: expected '{' to start module 'Test'
module Test:Colon {
            ^
include/module.modulemap:1:13: error: expected module declaration
include/module.modulemap:1:19: error: expected module declaration
module Test:Colon {
                  ^
include/module.modulemap:2:1: error: expected module declaration
}
^

By the way, tried a module name with a colon and ModuleMapParser doesn't accept it

include/module.modulemap:1:12: error: skipping stray token
module Test:Colon {
           ^
include/module.modulemap:1:13: error: expected '{' to start module 'Test'
module Test:Colon {
            ^
include/module.modulemap:1:13: error: expected module declaration
include/module.modulemap:1:19: error: expected module declaration
module Test:Colon {
                  ^
include/module.modulemap:2:1: error: expected module declaration
}
^

So it looks like this wouldn't be conflict with clang modules, do I understand right?

arames added a subscriber: arames.Mar 30 2022, 2:51 AM

I was was asked to chime in to assess whether this patch could be a problem for
the prebuilt-implicit clang modules workflow. No problem here.
With prebuilt modules, the output file name has to be specified manually. So the
mapping does not change the existing requirement of managing the file names to
avoid collissions.
Even in the future with implicit modules - as noted by others - : and - are
not allowed in the clang module name in the .modulemap, so there cannot be any
conflict.

A couple comments on the way. Take them with a grain of salt, since I don't much
about how this is going to be used or what stage of the work this is.

I see the mapping is only applied on the "prebuilt" code path, and not the
"implicit" and "prebuilt implicit" code paths. I suppose at some point, module
generation will be implicit. So any particular handling of ModuleName to be
appended to the filesystem path will need to be done on different code paths. So
I would already abstract this away in a helper.

Without much of an informed opinion, I find @iains makes a valid point. Does
this kind of mapping (semantic module name to name being used for filesystem
search/storage) belong at a higher level ? Maybe that's to be answered or worked
on later as well.

iains accepted this revision.Mar 30 2022, 8:40 AM

I was was asked to chime in to assess whether this patch could be a problem for
the prebuilt-implicit clang modules workflow. No problem here.
With prebuilt modules, the output file name has to be specified manually. So the
mapping does not change the existing requirement of managing the file names to
avoid collissions.
Even in the future with implicit modules - as noted by others - : and - are
not allowed in the clang module name in the .modulemap, so there cannot be any
conflict.

great.

A couple comments on the way. Take them with a grain of salt, since I don't much
about how this is going to be used or what stage of the work this is.

I see the mapping is only applied on the "prebuilt" code path, and not the
"implicit" and "prebuilt implicit" code paths. I suppose at some point, module
generation will be implicit. So any particular handling of ModuleName to be
appended to the filesystem path will need to be done on different code paths. So
I would already abstract this away in a helper.

At this time I do not think that the 'implicit' pathways are in use for 'C++ standard'
modules, (and it might be some time before we get to that point)

Without much of an informed opinion, I find @iains makes a valid point. Does
this kind of mapping (semantic module name to name being used for filesystem
search/storage) belong at a higher level ? Maybe that's to be answered or worked
on later as well.

Yes, the generalisation is in the short-is queue for things to be applied (it is a precursor
to support for P1184).

However, I think that this patch can go ahead in the mean time, to help short-term
workflows.

So this LGTM as it is.

This revision is now accepted and ready to land.Mar 30 2022, 8:40 AM
This revision was landed with ongoing or failed builds.Mar 30 2022, 8:46 PM
This revision was automatically updated to reflect the committed changes.