This is an archive of the discontinued LLVM Phabricator instance.

[C++] [Modules] Support one phase compilation model for named modules
Abandoned · Public

Authored by ChuanqiXu on Sep 20 2022, 2:02 AM.

Details

Summary

This patch makes clang generate the BMI implicitly in the module cache, which allows the user to compile a hello-world example with a single command line.

This also tries to fix the problem I raised a year ago: https://discourse.llvm.org/t/make-command-line-support-for-c-20-module-uniform-with-gcc/59144

More importantly, the patch eases the work of build system writers, making their initial support simpler. For example, with the branch https://github.com/ChuanqiXu9/llvm-project/tree/MyP1689, we can compile the HelloWorld examples with cmake after only small changes.

Diff Detail

Event Timeline

ChuanqiXu created this revision. Sep 20 2022, 2:02 AM
Herald added a project: Restricted Project. Sep 20 2022, 2:02 AM
ChuanqiXu requested review of this revision. Sep 20 2022, 2:02 AM
  • Add ReleaseNotes.
  • Refine the behavior about partitions.

This also tries to fix the problem I raised a year ago: https://discourse.llvm.org/t/make-command-line-support-for-c-20-module-uniform-with-gcc/59144

I think this thread is fairly different from what this patch is proposing.

@rsmith for discussion of adding effectively implicit modules support for C++20 modules - this seems like a pretty significant design choice, but something we (you and I, SG15, etc) have discussed before in terms of "how do we support 'Hello, World.'" situations without requiring something that amounts to a build system even for the simplest C++20 modules programs. Maybe this is it? But it surprises me a bit, and I don't think the linked thread is sufficiently relevant to draw conclusions about this direction.

ChuanqiXu added a reviewer: rsmith. Edited Oct 7 2022, 8:49 PM

I don't quite understand your point. From my perspective, the thread (https://discourse.llvm.org/t/make-command-line-support-for-c-20-module-uniform-with-gcc/59144) asks why clang's command line interface for modules is clearly more complex than gcc's. And @rsmith replied:

Currently, Clang lacks two important features here:

  1. Produce a .pcm file and a .o file from a single compilation action.

in https://discourse.llvm.org/t/make-command-line-support-for-c-20-module-uniform-with-gcc/59144/12. And I think that is exactly what this patch tries to do: generate the .pcm file and the .o file in a single compilation action.

In fact, this doesn't change the behavior much. Previously, when we compiled a *.cppm directly to *.o, the compiler already generated a *.pcm automatically - it just landed under /tmp by default. All this patch does is change the default location of the *.pcm file from /tmp to the module cache path. It doesn't affect the original two-phase compilation model.
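To illustrate (a minimal sketch; the file names are made up, and the cache-path spelling follows the -fmodules-cache-path option discussed below):

# Explicit two-phase model, unchanged by this patch:
clang++ -std=c++20 --precompile Hello.cppm -o Hello.pcm
clang++ -std=c++20 Hello.pcm -c -o Hello.o

# One-phase model: the BMI is produced implicitly as a side effect;
# this patch merely redirects it from /tmp into the module cache path:
clang++ -std=c++20 -fmodules-cache-path=pcm.cache Hello.cppm -c -o Hello.o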

BTW, this can simplify the use of modules significantly; see https://reviews.llvm.org/D135507#change-gpsoCkM1g61J for an example.

The use of a module cache path is, I think, a pretty major difference between what was discussed on the discourse thread and what's being proposed here - a module "cache" starts to approach Clang's old implicit modules, which have real problems in terms of parallelism, etc. (granted, what you're proposing here doesn't go all the way there - it doesn't add implicit actions to generate the pcm, at least).

Is this what GCC supports? I'd expect something more like Split DWARF, where the .pcm (like the .dwo file) is placed next to the .o file - not in some separate directory (what naming scheme would ensure that paths in that directory are unique?).

Oh, I see. So you're not against the idea of producing a .pcm file and a .o file in a single compilation - then we don't have any fundamental conflict.

Is this what GCC supports?

It looks similar to me. GCC automatically generates and searches for the BMI (GCC calls it a CMI) in the {current_dir}/gcm.cache directory. But I couldn't find an option in the GCC manual to specify the cache directory, so I guess the path is fixed in GCC. The difference in this patch is that we let the user specify the location of the cache dir. Then I wondered: if I want a path to store the module cache and there is already an existing option, -fmodules-cache-path, can I reuse it? Hence this patch.
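For comparison, a sketch of the GCC behavior described here (GCC 11+; note that GCC needs no special suffix, since its FE detects the module keywords; the CMI layout inside gcm.cache may vary by version):

# hello.cpp contains 'export module Hello;' - GCC implicitly writes the
# CMI into ./gcm.cache while producing the object:
g++ -std=c++20 -fmodules-ts -c hello.cpp -o hello.o
ls gcm.cache/   # now contains the CMI for module Hello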

a module "cache" is what starts to touch near Clang's old implicit modules that has real problems in terms of parallelism

Could you elaborate on that? Do you mean it is bad for C++20 modules to share the cache path with Clang modules, or that the idea of a cache path itself is bad?

iains added a comment. Edited Oct 10 2022, 12:15 AM

@ChuanqiXu BTW, I think we (unfortunately) have some overlap in work in progress here;

I have implemented a similar scheme to GCC's, where CMIs/BMIs can be produced "on demand" when the FE encounters the module keywords. The idea was to make it possible to use the same command lines on both compilers (basically, to remove the need for the user to identify that a C++ source is modular - with the sole exception of header units, which the compiler cannot figure out on its own).

This scheme interfaces with the module-mapper approach (but the mapping could just as easily be determined from command line flags, module map files, etc. - that is the beauty of the approach: the compiler has no need to know how the mapping is made). It can be a fixed in-process scheme (like using the module cache path plus some well-known mapping of names) or a full-blown build system, e.g. attaching to build2 (I think Boris did try out an earlier version of my implementation).

Draft patches (need rebasing) here: D118896, D118898, D118895, D118899 (these do not include the libCody stuff, which would support the module mapper, but concentrate on the compiler side).
Edit: D118895 is probably no longer relevant.

Reworking/rebasing this scheme and posting the patches is next on my TODO after (current work on) P1815 part 2.

Hopefully, there is some way to combine these pieces of work to give the user greatest flexibility.

@iains oh, I just remembered that you said in a GitHub issue long ago that you'd like to take this on. My bad - I forgot about it when I started this. Please add me as a subscriber to such drafts next time so we can avoid this happening again.
In fact, I have had module drafts that conflicted with yours before, so I certainly understand the feeling : (

And for this concrete job (generating the pcm and the object file in one compilation), I think my implementation looks better: it is shorter and reuses the current behavior (the compiler already generates .pcm files under /tmp; we just redirect them). So it may be better to pursue this patch for this task. (The option design could be changed, of course.)


(The following may not be related to this patch.)
It is definitely valuable to have command line compatibility between clang and gcc. But for the libcody things, we would need an RFC on discourse first. Also, I'm not sure whether we should pursue the module mapper (I'm not against it; I'm just hesitant). For example, I am implementing the P1689 proposal from the Kitware folks in https://github.com/ChuanqiXu9/llvm-project/tree/MyP1689/P1689Examples/HelloWorld, and it looks good that we can compile a hello-world example now. So I'm not sure the GCC module mapper is necessary here. Maybe we need more discussion.

iains added a comment. Oct 10 2022, 1:54 AM

I need to review this patch some more, but at first look it seems to be driver-only, so I am not sure how you are arranging for the PCM and the object to be created from a single invocation of the FE. The idea of my patch series is to match the behavior of GCC, where the FE can figure out (independently of any file suffix) that it needs to emit a BMI.


(The following may not be related to this patch.)

perhaps - but a small response here (we can take this to a different forum if needed)

It is definitely valuable to have command line compatibility between clang and gcc. But for the libcody things, we would need an RFC on discourse first.

Well, the idea had also been discussed among the 'modules group', but sure, we can do that. However, the module mapper is not required to make this initial patch series useful on its own - the series is about automatically producing the BMI without special command lines.

Also, I'm not sure whether we should pursue the module mapper (I'm not against it; I'm just hesitant). For example, I am implementing the P1689 proposal from the Kitware folks in https://github.com/ChuanqiXu9/llvm-project/tree/MyP1689/P1689Examples/HelloWorld, and it looks good that we can compile a hello-world example now. So I'm not sure the GCC module mapper is necessary here. Maybe we need more discussion.

There is almost certainly more than one solution to this, and different solutions will fit different scenarios - CMake and build2 are examples, but it is actually quite nice, with a small project, to be able to use an in-process resolver for names and not require any external build system. I realise a lot of discussion in SG15 is targeted at the larger-scale problems, but we should also provide (as you suggest) for "hello world" - preferably without the user having to install a separate package to make it work.

I need to review this patch some more, but at first look it seems to be driver-only, so I am not sure how you are arranging for the PCM and the object to be created from a single invocation of the FE. The idea of my patch series is to match the behavior of GCC, where the FE can figure out (independently of any file suffix) that it needs to emit a BMI.

Let me add a few more words to ease understanding. The key here is that clang knows it is compiling a module unit before parsing - via the .cppm suffix or an explicit -xc++-module. And clang has always generated both the pcm file and the object file for a module unit. The original behavior when compiling *.cppm to *.o was:

  1. Compile *.cppm to /tmp/random-name.pcm
  2. Compile /tmp/random-name.pcm to *.o

And the only change this patch makes is the destination of /tmp/random-name.pcm.

BTW, the meaning of --precompile (or -emit-module-interface via -Xclang) is to stop at the precompilation step, not to enable precompilation - precompilation always happens for a module unit.
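A concrete illustration of that equivalence (the -Xclang spelling mirrors the note above; file names are made up):

# Stop at the precompilation step explicitly:
clang++ -std=c++20 --precompile foo.cppm -o foo.pcm
# Roughly equivalent cc1 spelling:
clang++ -std=c++20 -Xclang -emit-module-interface -c foo.cppm -o foo.pcm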


There is almost certainly more than one solution to this, and different solutions will fit different scenarios - CMake and build2 are examples, but it is actually quite nice, with a small project, to be able to use an in-process resolver for names and not require any external build system. I realise a lot of discussion in SG15 is targeted at the larger-scale problems, but we should also provide (as you suggest) for "hello world" - preferably without the user having to install a separate package to make it work.

Yeah, experimenting sounds good. I'll try to do more.

iains added a comment. Oct 10 2022, 2:42 AM

OK, that is clear then; here is the main difference in approach (which reflects the difference between the GCC scheme and the current clang one).

For my patch set:

  1. The requirement for special names (.cppm) or special tags (-xc++-module) for the driver is removed, since the FE can determine that a file is modular and produce the BMI as needed.
  2. The BMI is not produced and then moved; it is produced only when required, and in place.
  3. There is only one invocation of the FE, which produces both artefacts.

Now, of course, both approaches are reasonable - but for my 0.02GBP, I agree with the several folks who say that the user should not have to use special filename extensions for "standard C++".

<snip> the second part for now - we can come back to that as needed.

FAOD: I think this should not be about "whose patches" - but about the compilation model and command lines that we want to end up with.

I am concerned that with the current scheme (with or without the change in this review) we could end up with three process launches per modular file:

  • one to determine the deps
  • one to produce the BMI
  • one to produce the object

At least with GCC (and a patch series like the one I drafted) we can reduce that to 2

With the interfaces mentioned in P1184, it is plausible to make that 1 (at the expense of potentially many stalled builds - but stalled early in compilation, so hopefully with the smallest memory footprint we could achieve).

FAOD: I think this should not be about "whose patches" - but about the compilation model and command lines that we want to end up with.

Yeah, of course.

I am concerned that with the current scheme (with or without the change in this review) we could end up with three process launches per modular file:

  • one to determine the deps
  • one to produce the BMI
  • one to produce the object

At least with GCC (and a patch series like the one I drafted) we can reduce that to 2

With the interfaces mentioned in P1184, it is plausible to make that 1 (at the expense of potentially many stalled builds - but stalled early in compilation, so hopefully with the smallest memory footprint we could achieve).

I firmly believe the current scheme is better, for the following reasons:
(1) The current scheme potentially has higher parallelism; see the "Why two-phase compilation?" section in https://reviews.llvm.org/D134269 for examples, and the sketch after this list. I believe this is the future direction for speeding up compilation.
(2) Your scheme saves the time for deserialization. However, clang implements many ODR checks in the deserialization path. Skipping deserialization omits those ODR checks, which breaks the semantics. This is the reason that scheme may not be good.
(3) It makes the implementation more complex.
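To illustrate point (1), a sketch of the scheduling this enables (file names and the -fmodule-file spelling are illustrative):

# The importer only needs a.pcm, not a.o:
clang++ -std=c++20 --precompile a.cppm -o a.pcm
# ... so these two jobs can run in parallel:
clang++ -std=c++20 a.pcm -c -o a.o &
clang++ -std=c++20 -fmodule-file=a.pcm -c b.cpp -o b.o &
wait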

For the first two reasons, I believe clang is going in the right direction. Also, the filename suffix helps static analysis tools know whether a file is a module unit, so such tools can automatically scan every .cppm file in the working directory during preparation. For example, once build systems are ready, we can omit the module files from the build script, as in: https://github.com/ChuanqiXu9/llvm-project/blob/94a71a7046a8672a7e06311b0c05c0a8c9ae4619/P1689Examples/HelloWorld/CMakeLists.txt#L18-L19.
I believe this is why MSVC made a similar choice. (It would be easy for clang to support MSVC's .ixx extension.)
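For instance, with a dedicated suffix a tool can enumerate module units without parsing anything (a trivial sketch; the directory name is made up):

find src/ -name '*.cppm'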

iains added a comment. Edited Oct 10 2022, 11:55 PM

I firmly believe the current scheme is better, for the following reasons:
(1) The current scheme potentially has higher parallelism; see the "Why two-phase compilation?" section in https://reviews.llvm.org/D134269 for examples. I believe this is the future direction for speeding up compilation.

That is a possibility, at the expense of more process launches - and I do not think that the corresponding model for the server-interactive approach has yet been fully explored.

We go to some trouble (with the driver invoking the compiler in-process) to avoid unnecessary process launches - with the current state of hardware, it could well be more efficient to have many processes waiting rather than many processes launched.

(I am not making a claim here, just observing that the data are incomplete, since we do not have a corresponding model offered for the client-server case, and we do not have any kind of measurements - these are all "thought experiments").
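For reference, the in-process behavior mentioned above can be observed with the stock driver (a sketch; -### only prints the planned jobs, and -fno-integrated-cc1 forces a separate cc1 process):

clang++ -### -std=c++20 -c foo.cpp
clang++ -fno-integrated-cc1 -std=c++20 -c foo.cpp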

(2) Your scheme saves the time for deserialization. However, clang implements many ODR checks in the deserialization path. Skipping deserialization omits those ODR checks, which breaks the semantics. This is the reason that scheme may not be good.

What you seem to be saying here is that:

clang++ -std=c++20 foo.cxxm -c -o foo.o

Edit: maybe I should have written -xc++ foo.cxxm

Does different ODR analysis from:

clang++ -std=c++20 foo.cxxm --precompile -o foo.pcm
clang++ -std=c++20 foo.pcm -o foo.o

If so, then that is of some concern since those are both valid compilations in the current compiler.

In the first case, the back end is connected as an AST consumer to the front end - there is no serialise/deserialise. The model I implemented simply streams the AST instead of throwing it away ...

(repeating here, it is not the deserialisation that I think is key [although that is also useful] but minimising process launches).

(3) It makes the implementation more complex.

Agreed, but it's reasonably self-contained.

For the first two reasons, I believe clang is going in the right direction. Also, the filename suffix helps static analysis tools know whether a file is a module unit, so such tools can automatically scan every .cppm file in the working directory during preparation. For example, once build systems are ready, we can omit the module files from the build script, as in: https://github.com/ChuanqiXu9/llvm-project/blob/94a71a7046a8672a7e06311b0c05c0a8c9ae4619/P1689Examples/HelloWorld/CMakeLists.txt#L18-L19.
I believe this is why MSVC made a similar choice. (It would be easy for clang to support MSVC's .ixx extension.)

There seem to be two schools of thought: (1) it is good to use extensions to determine file contents; (2) we should not impose magic extension names if the implementation can determine the content without them.

For my part, I do prefer the second - but of course I will be happy to go with the consensus.

FAOD: I think this should not be about "whose patches" - but about the compilation model and command lines that we want to end up with.

BTW, in the current context, when I say "my" and "your", I just mean to refer to the corresponding things. No offense is intended.

(2) Your scheme saves the time for deserialization. However, clang implements many ODR checks in the deserialization path. Skipping deserialization omits those ODR checks, which breaks the semantics. This is the reason that scheme may not be good.

What you seem to be saying here is that:

clang++ -std=c++20 foo.cxxm -c -o foo.o

Edit: maybe I should have written -xc++ foo.cxxm

Does different ODR analysis from:

clang++ -std=c++20 foo.cxxm --precompile -o foo.pcm
clang++ -std=c++20 foo.pcm -o foo.o

If so, then that is of some concern since those are both valid compilations in the current compiler.

In the first case, the back end is connected as an AST consumer to the front end - there is no serialise/deserialise.

No. Currently in clang, the following two are the same:

clang++ -std=c++20 foo.cxxm -c -o foo.o

with

clang++ -std=c++20 foo.cxxm --precompile -o foo.pcm
clang++ -std=c++20 foo.pcm -o foo.o

The first compilation generates the .pcm file under /tmp and then compiles that .pcm file.

We can verify this by inserting debugging information, or by writing an ODR-mismatch example: when the compiler emits an error, we'll see that it emits the error in the deserialization part. I have hit both situations many times : )

The model I implemented simply streams the AST instead of throwing it away ...

So that model misses the ODR checks, which is not good, since it affects the semantics.

(repeating here, it is not the deserialisation that I think is key [although that is also useful] but minimising process launches).

So, from my point of view, it is much more important to preserve the semantics than to minimize process launches.

That is a possibility, at the expense of more process launches - and I do not think that the corresponding model for the server-interactive approach has yet been fully explored.

We go to some trouble (with the driver invoking the compiler in-process) to avoid unnecessary process launches - with the current state of hardware, it could well be more efficient to have many processes waiting rather than many processes launched.

(I am not making a claim here, just observing that the data are incomplete, since we do not have a corresponding model offered for the client-server case, and we do not have any kind of measurements - these are all "thought experiments").

Yeah, I agree there is more space to explore in the client-server model. But for the current build model, two-phase compilation is clearly better than one-phase compilation. And my point is: if one day we prove there are many good points in the client-server model, it won't be too late to add it to clang. For now, it is not a good reason to block the current patch - at least the patch is pretty simple and innocent.

There seem to be two schools of thought: (1) it is good to use extensions to determine file contents; (2) we should not impose magic extension names if the implementation can determine the content without them.

For my part, I do prefer the second - but of course I will be happy to go with the consensus.

I'm confused by the second. What if other static analysis tools (build systems included, but not limited to them) want to analyze a modular project and can only invoke the compiler to query the result? Things already look complex enough in that state. Even if we implement great compiler interfaces, there are still problems: the tools need to pass the potential module unit files to the compiler. With the first choice, the tools can pass the module unit files precisely, while with the second, the input set is much larger. So it looks really worse to me...


Since we discussed many things, let me try to summarize so we don't lose the thread.

(1) For the client-server model, I think it is too far off for now. I am not against it, and I don't think the current simple patch hurts it.
(2) Your implementation skips the deserialization part, which misses many ODR checks. I think this is not good.
(3) I suddenly realize that your design and mine do not overlap in one respect: this patch works on *.cppm files, while your design is intended for *.cpp files. So a question: what would your patch do for *.cppm files? If we don't handle them specially, then with your implementation, compiling a .cppm file would generate the .pcm file twice - once under /tmp and once in the place you specified. I think that is not acceptable.

So, in short: for the current state of clang, this patch is the more natural and appropriate fit, while your patch feels like less of a fit for current clang. I'm not saying that not fitting the current state is bad - otherwise we would never have inventions - just that it requires more work.

iains added a comment. Oct 11 2022, 1:20 AM

FAOD: I think this should not be about "whose patches" - but about the compilation model and command lines that we want to end up with.

BTW, in the current context, when I say "my" and "your", I just mean to refer to the corresponding things. No offense is intended.

sure - no problem.

I guess we ought to be a bit clearer:
There is implementation divergence between GCC and clang, and from the point of view of users and build systems, having two different command line sets is not helpful.

The patches I drafted were aimed at removing that divergence.
If that is not possible (because the ODR analysis requires steps that make it very difficult or inefficient), then more thought and/or restructuring is needed.

No. Currently in clang, the following two are the same:

clang++ -std=c++20 foo.cxxm -c -o foo.o

That's why I added the edit ...

clang++ -std=c++20 -xc++ foo.cxxm -c -o foo.o

AFAIU that will not produce any error for the user?

suppose I have a foo.cxx which imports modules and has local declarations?
clang++ -std=c++20 foo.cxx -c -o foo.o

with

clang++ -std=c++20 foo.cxxm --precompile -o foo.pcm
clang++ -std=c++20 foo.pcm -o foo.o

The first compilation generates the .pcm file under /tmp and then compiles that .pcm file.

yes, I should have added the -xc++ in the first case ;)

We can verify this by inserting debugging information, or by writing an ODR-mismatch example: when the compiler emits an error, we'll see that it emits the error in the deserialization part. I have hit both situations many times : )

The model I implemented simply streams the AST instead of throwing it away ...

So that model misses the ODR checks, which is not good, since it affects the semantics.

Understood - though I wonder whether we have the layering right here - surely Sema should be responsible for ODR checks?
Is there any reason the relevant hashes cannot be generated on the AST without streaming it?
Was this model intended from the start?
(The fact that one can connect a direct AST consumer to the back end suggests that enforcing serialisation/deserialisation was not expected to be a requirement at some stage.)

(1) For the client-server model, I think it is too far off for now. I am not against it, and I don't think the current simple patch hurts it.

Yeah, I get that - what we need is an example of how both the 'pre-scan' and 'discovery' schemes would work.

(2) Your implementation skips the deserialization part, which misses many ODR checks. I think this is not good.

We should fix the confusion here - if the serialisation is required for correctness, then we should diagnose an incorrect compilation. We should also consider whether we have broken the layering - ISTM that Sema should be responsible for the analysis, not the deserialisation.

(3) I suddenly realize that your design and mine do not overlap in one respect: this patch works on *.cppm files, while your design is intended for *.cpp files. So a question: what would your patch do for *.cppm files? If we don't handle them specially, then with your implementation, compiling a .cppm file would generate the .pcm file twice - once under /tmp and once in the place you specified. I think that is not acceptable.

There are some driver changes associated with the patches I have (not yet pushed to Phab) that would avoid that case (we certainly would not want to generate the output twice).

You are right that the overlap is not in code, but only in concept - your patch would not prevent the "on demand BMI" from being added later. The only downside is that having two ways to do things is likely to create confusion.

So, in short: for the current state of clang, this patch is the more natural and appropriate fit, while your patch feels like less of a fit for current clang. I'm not saying that not fitting the current state is bad - otherwise we would never have inventions - just that it requires more work.

yes, it seems that more work is needed.

However, I will say that when I started looking at this problem, I did consider trying to make the driver do the work to match GCC's model, but that seemed like a very bad fit - it would force the driver to inspect the source (or to start a process that does the same) to see whether there are module keywords, and then to query the name mapping interface (whatever that is) in order to decide what command lines to construct.

Clang header modules started with implicit builds and module caches. Apple is moving to explicit builds. There are all kinds of problems with implicit module builds. I believe C++20 modules shouldn't make the same mistake.

iains added a comment. Oct 11 2022, 1:37 AM

Clang header modules started with implicit builds and module caches. Apple is moving to explicit builds. There are all kinds of problems with implicit module builds. I believe C++20 modules shouldn't make the same mistake.

I do not believe that the client-server build system model (a.k.a. P1184 / the module mapper) is quite the same thing as implicit modules - there is a partial similarity in that the dependencies are discovered from the sources during compilation (rather than from a pre-scan).

However, it does not try to turn the compiler into a build system; that work is done by the module mapper (which can be some trivial in-process scheme or a full-blown build system, e.g. build2 - which I believe has an implementation that works with GCC).

(clearly we [the whole C++ community] need to figure out how we want these things to look to the end user and try to harmonise the implementations)

I do not believe that the client-server build system model (a.k.a. P1184 / the module mapper) is quite the same thing as implicit modules - there is a partial similarity in that the dependencies are discovered from the sources during compilation (rather than from a pre-scan).

However, it does not try to turn the compiler into a build system; that work is done by the module mapper (which can be some trivial in-process scheme or a full-blown build system, e.g. build2 - which I believe has an implementation that works with GCC).

But this diff is doing something else: it puts the build system back into clang.



That's why I added the edit ...

clang++ -std=c++20 -xc++ foo.cxxm -c -o foo.o

AFAIU that will not produce any error for the user?

suppose I have a foo.cxx which imports modules and has local declarations?
clang++ -std=c++20 foo.cxx -c -o foo.o

Oh, I got it. Now it makes sense. I missed the edit : (.

The answer is:
(1) Right now, the behavior is different. I once sent a bug report about it.
(2) Right now there is no direct error message. I was planning to add one; that is why I closed the above bug report : )
(3) Whether we will make it an error is not decided yet.

Understood - though I wonder whether we have the layering right here - surely Sema should be responsible for ODR checks?
Is there any reason the relevant hashes cannot be generated on the AST without streaming it?
Was this model intended from the start?
(The fact that one can connect a direct AST consumer to the back end suggests that enforcing serialisation/deserialisation was not expected to be a requirement at some stage.)

In fact, we do ODR checks in Sema too. (Do you find that surprising? I was amazed when I first realized it.)

For the following examples, the ODR check is made in the deserializer:

module;
#include "something"
export module any_module_name;
import m;

or

// decls
import m;

or

import m1;
import m2;

In the above cases, the ODR check happens in the deserialization part. In the following example, the ODR check happens in Sema:

import m;
// decls

What is the principle? I think it follows from the on-the-fly compilation principle. (I made that name up, but I think you know what I mean.)

That is: when we (the compiler) parse a declaration and want to know whether it violates the ODR, the check is made in Sema. And when we import something (precisely, when we read something from a module, since clang loads lazily), the ODR check is made in deserialization.
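To make that concrete, a minimal sketch of the kind of mismatch involved (file names and contents are made up; the point is only where the diagnostic fires):

// odr.h - textually included both while building module m and by the
// importer, but seen with different contents (e.g. due to macros):
//   inline int size() { return 1; }   // as seen while building m
//   inline int size() { return 2; }   // as seen by user.cpp

// m.cppm
module;
#include "odr.h"
export module m;

// user.cpp - the conflicting definition is parsed first, so the
// mismatch is detected while lazily deserializing m's size() at the
// import, not in Sema:
#include "odr.h"
import m;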

Although it felt odd to me at first, I now find it somewhat reasonable. How does it seem to you now?

ChuanqiXu added a comment. Edited Oct 11 2022, 1:55 AM

Clang header modules started with implicit builds and module caches. Apple is moving to explicit builds. There are all kinds of problems with implicit module builds. I believe C++20 modules shouldn't make the same mistake.

Could you provide more details? If you only say it is bad without giving reasons, we can't understand you, and then we can't get on the same page.

From my natural imagination, some kind of module cache is necessary. Otherwise, we could not compile a hello-world example with one command line, like:

clang++ -std=c++20 Hello.cppm main.cpp

For this case, gcc already creates a cache called gcm.cache in the current directory. So I feel this is unrelated to anything larger.

And for a larger project, see the demo at: https://github.com/ChuanqiXu9/llvm-project/tree/MyP1689/P1689Examples. There, the cache is controlled by the build system instead of the compiler, so we are not moving the build system into the compiler.

My rough feeling is that you're warning about something bad 100 steps away, while we're in fact moving just one step further. But let's get on the same page first.
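For completeness, a sketch of the two files behind that one-line command (illustrative contents; with the implicit BMI cache, the single invocation above handles both files):

// Hello.cppm
module;
#include <cstdio>
export module Hello;
export void hello() { std::puts("Hello, Modules!"); }

// main.cpp
import Hello;
int main() { hello(); }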

iains added a comment.Oct 11 2022, 1:56 AM

FAOD: I think this should not be about "whose patches" - but about the compilation model and command lines that we want to end up with.

BTW, in the current context, when I say "my" and "your", I just mean to refer to the corresponding things. No offense intended.

sure - no problem.

I guess we ought to be a bit more clear:
There is implementation divergence between GCC and clang, and from the point of view of users and build systems, having two different command line sets is not helpful.

The patches I drafted were aimed at removing that divergence.
If that is not possible (because the ODR analysis requires steps that make it very difficult or inefficient to do so) then some more thought and/or restructuring is needed.

Yeah, now this is clearer.

(2) Your scheme saves the time for deserialization. However, clang implements many ODR checks in the deserialization part. Skipping deserialization will omit these ODR checks, which will break the semantics. And this is the reason why that scheme may not be good.

What you seem to be saying here is that:

clang++ -std=c++20 foo.cxxm -c -o foo.o

edit; maybe I should have written -xc++ foo.cxxm

Does different ODR analysis from:

clang++ -std=c++20 foo.cxxm --precompile -o foo.pcm
clang++ -std=c++20 foo.pcm -o foo.o

If so, then that is of some concern since those are both valid compilations in the current compiler.

In the first case, the back end is connected as an AST consumer to the front - there is no serialise/deserialise.

No. Currently, in clang, the following two are the same:

clang++ -std=c++20 foo.cxxm -c -o foo.o

That's why I added the edit ...

clang++ -std=c++20 -xc++ foo.cxxm -c -o foo.o

AFAIU that will not produce any error for the user?

suppose I have foo.cxx which includes modules and has local declarations?
clang++ -std=c++20 foo.cxx -c -o foo.o

with

clang++ -std=c++20 foo.cxxm --precompile -o foo.pcm
clang++ -std=c++20 foo.pcm -o foo.o

The answer is:
(1) Currently, the behavior is different. I once sent a bug report for it.
(2) Currently there is no direct error message. I was planning to add one. This is the reason why I closed the above bug report : )
(3) Whether we're going to make it an error is not decided yet.

If we are going to require the serialisation/deserialisation for correctness then IMO we should have an error for an incorrect compilation.

What is the principle? I think it follows from the on-the-fly compilation principle (a name I made up myself, but I think you know what I mean).

I mean, when we (the compiler) parse a declaration and want to know whether it violates the ODR, we do the check in Sema. And when we import something (precisely, when we read something from a module, since clang does lazy loading), we do the ODR check in deserialization.

Although it felt odd to me at first, now I find it somewhat reasonable. How do you feel now?

I still feel that deserialisation should do deserialisation, and Sema should be issuing the diagnostics when it tries to use a decl (in fact, I would expect that to be required for correctness in the C++20 scheme, since the point of use is significant). I will defer to folks who know the history better - but this seems like a layering bug to me still.

If we are going to require the serialisation/deserialisation for correctness then IMO we should have an error for an incorrect compilation.

Of course.

I still feel that deserialisation should do deserialisation, and Sema should be issuing the diagnostics when it tries to use a decl (in fact, I would expect that to be required for correctness in the C++20 scheme, since the point of use is significant). I will defer to folks who know the history better - but this seems like a layering bug to me still.

Another reason I forgot to mention is that the deserialization needs to do merging. It is common in practice that similar declarations live in multiple module files (e.g., via the GMF). Then if the deserialization needs to merge things, it is natural that the deserializer needs to do ODR checks. But if you feel like the merging job belongs to Sema too, I think that might not be bad/wrong.

iains added a comment.Oct 11 2022, 2:32 AM

Another reason I forgot to mention is that the deserialization needs to do merging. It is common in practice that similar declarations live in multiple module files (e.g., via the GMF). Then if the deserialization needs to merge things, it is natural that the deserializer needs to do ODR checks. But if you feel like the merging job belongs to Sema too, I think that might not be bad/wrong.

I would be concerned that doing this work in the deserialisation could:

  • produce false positives (diagnostics) where there should be no error until an object is actually used.
  • make it more difficult to track cases where what is merged at the point of use is semantically significant.

It would seem better that the consumer of the AST should not have to know whether it had been round-tripped through the serialisation/deserialisation process.

From the point of view of this diff - it is relevant to consider other implementations that achieve the same objective - we seem to have uncovered some additional questions to answer.

ChuanqiXu added a comment.EditedOct 11 2022, 2:45 AM

I would be concerned that doing this work in the deserialisation could:

  • produce false positives (diagnostics) where there should be no error until an object is actually used.
  • make it more difficult to track cases where what is merged at the point of use is semantically significant.

It would seem better that the consumer of the AST should not have to know whether it had been round-tripped through the serialisation/deserialisation process.

The reading of modules is lazy, so the problem is relatively mitigated. But I understand your concern to some extent.

From the point of view of this diff - it is relevant to consider other implementations that achieve the same objective - we seem to have uncovered some additional questions to answer.

I feel the main blocking issue for this diff is the concern about the module cache raised by @dblaikie and @tschuett, for which we (at least I) don't know the reasons. I'm still waiting for more detailed reasons so I can understand them.

And I feel like your opinion is mainly at a higher level, and I don't feel it is blocking this patch: if we're actually going to refactor this someday, this diff won't be a blocker. And within the current framework, I think this implementation is the simplest way to achieve the same goal. So what we talked about is mainly higher-level things.

iains added a comment.Oct 11 2022, 3:41 AM

The reading of modules is lazy, so the problem is relatively mitigated. But I understand your concern to some extent.

(IMO) We ought to be able to move the ODR work to Sema (but I did not yet look at how hard this would be) - it seems that deserialisation should supply (lazily) decls in response to requests and the query side should determine if they are legally mergeable. However, let's move this discussion elsewhere now - I think we've covered the main points.

And I feel like your opinion is mainly at a higher level, and I don't feel it is blocking this patch: if we're actually going to refactor this someday, this diff won't be a blocker.

I think that is correct - the implementations are independent (my, so far unpublished, driver patches probably would require modification, but that's not a big deal).

And within the current framework, I think this implementation is the simplest way to achieve the same goal. So what we talked about is mainly higher-level things.

Sure, but having a high-level plan is important too :)

One specific comment about this patch; one of the key things that GCC does is to remove the need for the user to write a specific extension to use modular C++. So from that point of view, at least, this patch does not provide a solution to match the command line interfaces of GCC.

To the original point I was making about implicit modules (which might've been my own confusion due to the term being brought up in D134269):

Implicit modules is the situation where the compiler, finding that it needs a module while compiling some usage, knows how to go out and spawn a new process to create that module and then cache it. The build system never knows about modules, their builds or their dependencies.

This patch had some superficial similarities - specifically around having an on-disk cache of modules that the user/build system/etc invoking the compiler isn't necessarily aware of/managing/invalidating/observing/etc. But it does differ in an important way from implicit modules in that the compiler won't implicitly build modules - you still have to specify the modules and their usage as separate compilations in-order (ie: modules need to be explicitly built before their usage). I think that makes a big difference to this being feasible, at least in the small scale.

The remaining concern is that this feature should likely not be used by a build system - because it won't know the dependencies (or, if it does know the dependencies then the build system, not the compiler, should be managing the BMIs) & so won't know how to schedule things for maximum parallelism without incorrect ordering, and correct rebuilding of dependencies when necessary.

So maybe we could limit the "Hello, World" situation only to cases where the compiler is producing temporary object files? I know this probably sounds absurd, but coming back to a point I made earlier about .pcm files being named with the same scheme as .o files - so if the .o file goes into the temp directory, the .pcm goes there in the same manner - the driver could, potentially, add that module (-fmodule-file or whatever is necessary) to all subsequent compilations on the same command line (eg: clang++ x.cppm y.cppm z.cpp a.o - so x.cppm gets built temporarily, y.cppm gets built temporarily and gets -fmodule-file=/tmp/x.pcm or whatever, then z.cpp gets built with -fmodule-file=/tmp/x.pcm -fmodule-file=/tmp/y.pcm and then the whole thing gets linked as usual)
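
Spelled out, that command line might expand under the hood to roughly the following (the paths and exact flag set are illustrative):

clang++ -std=c++20 --precompile x.cppm -o /tmp/x.pcm
clang++ -std=c++20 /tmp/x.pcm -c -o /tmp/x.o
clang++ -std=c++20 -fmodule-file=/tmp/x.pcm --precompile y.cppm -o /tmp/y.pcm
clang++ -std=c++20 /tmp/y.pcm -c -o /tmp/y.o
clang++ -std=c++20 -fmodule-file=/tmp/x.pcm -fmodule-file=/tmp/y.pcm z.cpp -c -o /tmp/z.o
clang++ /tmp/x.o /tmp/y.o /tmp/z.o a.o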

This ensures the feature doesn't creep into build systems where it might be more confusing than helpful, given the implicit dependencies between build commands. It's admittedly going to be confusing to users that they can't trivially split their command line out into multiple, but I think that's probably the tipping point in complexity for modules builds where users should reach for a build system or live with the complexity of passing .pcm files explicitly between these steps.


Sorry I haven't been following the discussion between this and what @iains is proposing. I guess/roughly understand that @iains is proposing something like the module-mapper support? Which I rather like the idea of, though understand it has more compiler surface area/complex interaction with the mapping service.

I guess @iains is proposing that the mapper could be used to implement/address this "Hello, World" situation in some way? (I guess that looks maybe /more/ like implicit modules than this proposal - in that the module mapper does mean that during one compilation, another compilation might run - and for a build-system agnostic module mapper that boils down to the problems of build parallelism that implicit modules have (though it had a lot of success with that model, not having to modify build systems/being able to ship sooner/etc - it's been around for years, and as much as we've learned of its limitations, it's hard to argue with the success/suggest that it wasn't the right tradeoff (I have some reservations still, but that's more an social/after-work ramble than technical)))

In any case, it looks like there's design discussions to be had? Not sure if this is the right place to have them, but maybe it is. It might be easier to discuss them on discourse with hopefully some relatively narrow examples with command lines shown, etc. (as already some of the conversation seems to be fragmented between this change and the doc change)

iains added a comment.Oct 11 2022, 8:25 AM

(trying not to derail this discussion further)

  • Yes, the alternate solution proposed for the "Hello World" case would work with the module mapper interface. However, the initial discussion was simply about being able to produce both the .PCM and .O artefacts from one invocation of the FE.
  • It seemed reasonable to mention (since it's not vapourware, there are draft patches), but clearly some technical issues to address.
  • I do not think this patch fully addresses the issue of harmonising GCC and clang's command lines for modular builds (because it does not deal with discovery of modular code in 'normally named' sources), and my investigation of trying to do this in the driver suggests that it would be much more complex than doing it in the FE.

In the discussions, we have uncovered some other things that should be discussed elsewhere too.
(so I agree any further discussion of the alternate interface can be moved elsewhere - either wait for the rebased patches or on discourse).

What really worries me is that you are talking about one-phase compilation and there is a module cache. Even with one-phase compilation, you must pass all dependent modules on the command line.

The one-phase compilation should work with -fno-implicit-modules!

(trying not to derail this discussion further)

  • Yes, the alternate solution proposed for the "Hello World" case would work with the module mapper interface. However, the initial discussion was simply about being able to produce both the .PCM and .O artefacts from one invocation of the FE.

*nod* If there's some other patch that does just that - produces a .o/.pcm together in a single invocation, maybe that's a less contentious/easier patch to review/discuss/approve/commit. The number of process invocations involved in that (whether the driver, under the hood, invokes clang once to generate a .pcm and another to generate the .o) I'm less fussed about - I think that's something that can be improved separately - especially once/if someone wants to change the .pcm format to be lighter weight (not enough to generate the .o from, but enough for compilations /using/ the module to do so) - that'll drive a need for a monolithic .cppm -> {.o,.pcm} action.

  • It seemed reasonable to mention (since it's not vapourware, there are draft patches), but clearly some technical issues to address.

For sure.

  • I do not think this patch fully addresses the issue of harmonising GCC and clang's command lines for modular builds (because it does not deal with discovery of modular code in 'normally named' sources), and my investigation of trying to do this in the driver suggests that it would be much more complex than doing it in the FE.

Yes, that would be hard to implement in the driver - but my personal taste is that if the compiler's going to treat the input differently, or produce different outputs, it should be observable on the command line - either through different flags (-x) or a different file extension. I think that's been historically true? Though I realize it also comes up against "but we have all this C++ with these existing file extensions" and especially if GCC doesn't agree on that philosophy, Clang's going to have a hard time making a stand there... :/

iains added a comment.Oct 11 2022, 9:09 AM
  • I do not think this patch fully addresses the issue of harmonising GCC and clang's command lines for modular builds (because it does not deal with discovery of modular code in 'normally named' sources), and my investigation of trying to do this in the driver suggests that it would be much more complex than doing it in the FE.

Yes, that would be hard to implement in the driver - but my personal taste is that if the compiler's going to treat the input differently, or produce different outputs, it should be observable on the command line - either through different flags (-x) or a different file extension. I think that's been historically true? Though I realize it also comes up against "but we have all this C++ with these existing file extensions" and especially if GCC doesn't agree on that philosophy, Clang's going to have a hard time making a stand there... :/

I think that ship has sailed, GCC does, indeed, already take an input like "foo.cxx" and automatically determine that it should emit a BMI as well as the .O and does so.

It is (IMO, at least) reasonable to forecast that there will be proponents of both "we should have extensions to disambiguate" and "the compiler should be able to work it out from standardised C++ and not force us to rename files in a strange way". The degree of support for either camp will no doubt include religious fervour ...

As compiler engineers, I'd say that regardless of which view we might take for our own code, we have to cater for both.

To the original point I was making about implicit modules (which might've been my own confusion due to the term being brought up in D134269):

Implicit modules is the situation where the compiler, finding that it needs a module while compiling some usage, knows how to go out and spawn a new process to create that module and then cache it. The build system never knows about modules, their builds or their dependencies.

This patch had some superficial similarities - specifically around having an on-disk cache of modules that the user/build system/etc invoking the compiler isn't necessarily aware of/managing/invalidating/observing/etc. But it does differ in an important way from implicit modules in that the compiler won't implicitly build modules - you still have to specify the modules and their usage as separate compilations in-order (ie: modules need to be explicitly built before their usage). I think that makes a big difference to this being feasible, at least in the small scale.

Oh, now I got your point. It is caused by the imprecise name. My bad.

The remaining concern is that this feature should likely not be used by a build system - because it won't know the dependencies (or, if it does know the dependencies then the build system, not the compiler, should be managing the BMIs) & so won't know how to schedule things for maximum parallelism without incorrect ordering, and correct rebuilding of dependencies when necessary.

I agree it won't reach the maximum parallelism. But I think it should be able to rebuild correctly if the build system understands the dependencies between module units. For example, if B.cpp imports module A, and A is defined in A.cppm, then when A.cppm changes, it will be fine as long as the build system compiles A.cppm first and compiles B.cpp afterwards. I think this is achievable by the build system (for example, via the P1689 proposal I'm working on). So the problem becomes a performance problem instead of a correctness problem, which doesn't look bad to me. I still feel we shouldn't let the perfect be the enemy of the better.
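
A concrete sketch of what I mean (the contents are made up):

// A.cppm
export module A;
export int value() { return 43; }

// B.cpp
import A;
int main() { return value(); }

# when A.cppm changes, rebuilding in dependency order stays correct:
clang++ -std=c++20 --precompile A.cppm -o A.pcm
clang++ -std=c++20 -fmodule-file=A.pcm B.cpp -c -o B.o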

P.S.: there is another case: after we've built the program, we remove A.pcm without touching A.cppm. The current model can't handle that unless the build system actually tracks the .pcm. But this case doesn't make me too uncomfortable.

In any case, it looks like there's design discussions to be had? Not sure if this is the right place to have them, but maybe it is. It might be easier to discuss them on discourse with hopefully some relatively narrow examples with command lines shown, etc. (as already some of the conversation seems to be fragmented between this change and the doc change)

Yeah, we can discuss it somewhere else.

@tschuett

What really worries me is that you are talking about one-phase compilation and there is a module cache. Even with one-phase compilation, you must pass all dependent modules on the command line.

I thought the compiler could search the module cache automatically. Do you mean that style is bad?

The one-phase compilation should work with -fno-implicit-modules!

Great catch! I think we can solve the problem with another option (-fcxx-modules-cache-path ?) to decouple this from Clang modules. Originally I felt it might be better to reuse the existing option to avoid confusion, but it looks like there is too much module-cache logic tied to Clang modules.

I do not think this patch fully addresses the issue of harmonising GCC and clang's command lines for modular builds (because it does not deal with discovery of modular code in 'normally named' sources), and my investigation of trying to do this in the driver suggests that it would be much more complex than doing it in the FE.

Yeah, it is not sufficient to say this patch addresses the command line conformance issue. This patch only eases the use of modules.

boris added a subscriber: boris.Oct 12 2022, 2:11 AM

Just to add a few points on the module mapper discussion: overall, the current view is that there are two ways to handle C++20 modules from the build system's POV: (1) pre-scan the codebase to figure out what is what and who depends on whom ahead of compilation and (2) module mapper where the compiler and the build system discover this information interactively during compilation. Naturally, both approaches have advantages and disadvantages.

The major issue with the pre-scan is that there needs to be a clear boundary where the codebase ends. This can be an acceptable requirement for monorepo-style projects but not for multirepo. Even for monorepo, there are often external dependencies (C/C++ standard library, maybe system libraries like OpenSSL) that require special considerations. There is also some concern for the performance of the pre-scan on larger codebases. On the other hand, pre-scan is likely to be easier to integrate with legacy and meta-build systems like CMake.

The major issue with the module mapper is the need for deeper integration with the build system as well as the overall complexity of the client/server architecture. There is also some concern for the resource consumption since the mapper approach may need to spawn nested compiler invocations to build missing BMIs. The main advantage of the module mapper is probably the fact that the information comes from the source (the compiler) and if/when necessary, which sidesteps the whole issue of doing too little or too much pre-scanning (for example, due to imprecise boundaries, some discrepancies in the compiler options, etc).

GCC implements the mapper approach (with the reusable parts factored into libcody) and we successfully use that in build2. We would definitely like to see the module mapper supported by Clang, ideally with the GCC's interface (so that we could use the existing mapper for both). In the build2's case specifically, pre-scan is not a viable option since it's a multirepo-first build system.
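
(For reference, the mapper exchange looks roughly like the following; the verbs are from my memory of P1184/libcody, so treat them as illustrative.)

compiler -> mapper:  MODULE-IMPORT foo
mapper -> compiler:  PATHNAME gcm.cache/foo.gcm    (built first by the build system if missing)
compiler -> mapper:  MODULE-EXPORT bar
mapper -> compiler:  PATHNAME gcm.cache/bar.gcm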

Also, a comment on the earlier point made:

I do not believe that the client-server build system model (a.k.a. P1184 / module mapper) is quite the same thing as implicit modules

Strongly agree. The key difference here is that with the mapper there is a single entity (the build system) that decides which things need to be (re)built and where rather than multiple compiler instances trying to come up with something consistent.

Hi Boris, thanks for the update. (Though I feel this may not be the right place to discuss the topic.)

Personally, I don't have a strong feeling about the client-server model (for or against). And I'd love to support the pre-scan based method, since it is relatively easy for compilers and build systems, so that other build systems (including but not limited to cmake) can support modules more quickly. Then more users can actually start to use modules, and we can get more feedback to improve things. And I feel the two options may not be mutually exclusive. For example, my experimental support for P1689 is at: https://github.com/ChuanqiXu9/llvm-project/commit/fd809e8fcba497eb9458d7dbb6fc194044b19521. The implementation looks relatively trivial to me, and that simplicity is pretty important. So I wouldn't feel it is too late to support the client-server model in clang if the client-server model someday shows great advantages.
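
To give a flavour, the per-TU dependency file in that experiment looks roughly like this (the field spellings are from the P1689 draft as I remember them, so treat the details as illustrative):

{
  "version": 1,
  "revision": 0,
  "rules": [
    {
      "primary-output": "B.o",
      "provides": [],
      "requires": [ { "logical-name": "A" } ]
    }
  ]
}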

boris added a comment.Oct 12 2022, 3:11 AM

For example, my experimental support for P1689 is at: [...]. The implementation looks relatively trivial to me. The simplicity is pretty important.

Two points:

  1. It's worth considering the simplicity of the overall system (compiler + build system + user project) rather than just the compiler. I hope my previous comment highlighted some of the complexity that the overall system must deal with in the pre-scan approach with a lot of it potentially ending up on the user's plate.
  2. I haven't looked at the code, but there are some complex problems in this area, as highlighted by this MSVC bug: https://developercommunity.visualstudio.com/t/scanDependencies-does-not-take-into-acc/10029154 I believe you may have it easier because reportedly Richard & friends have already implemented the necessary header import isolation semantics.

iains added a comment.Oct 12 2022, 8:35 AM

To avoid more tangential discussion here - I've opened https://discourse.llvm.org/t/generating-pcm-module-interfaces-and-regular-object-files-from-the-same-compiler-invocation/65918 ... it would be great if the major stakeholders here especially @rsmith could comment on the design intentions.

For example, my experimental support for P1689 is at: [...]. The implementation looks relatively trivial to me. The simplicity is pretty important.

Two points:

  1. It's worth considering the simplicity of the overall system (compiler + build system + user project) rather than just the compiler. I hope my previous comment highlighted some of the complexity that the overall system must deal with in the pre-scan approach with a lot of it potentially ending up on the user's plate.
  2. I haven't looked at the code, but there are some complex problems in this area, as highlighted by this MSVC bug: https://developercommunity.visualstudio.com/t/scanDependencies-does-not-take-into-acc/10029154 I believe you may have it easier because reportedly Richard & friends have already implemented the necessary header import isolation semantics.

Yeah, it looks like header units bring new complexity. And my demo only works for named modules. I need to reconsider it.

My original thought for these two modes was that the compiler could avoid making the choice. I mean, the compiler could support both modes and let build system writers and end users make the choice.

And I believe there will be a discussion about the client-server model vs. the pre-scan model in clang. I think we can discuss more then.

To avoid more tangential discussion here - I've opened https://discourse.llvm.org/t/generating-pcm-module-interfaces-and-regular-object-files-from-the-same-compiler-invocation/65918 ... it would be great if the major stakeholders here especially @rsmith could comment on the design intentions.

Yeah, this is about the deserialization part. I believe there'll be 2 other discussions, about filename suffixes and the client-server model. Though I feel none of them are blockers for, or even related to, this patch : )

iains added a comment.Oct 13 2022, 1:13 AM

Yeah, it looks like header units bring new complexity. And my demo only works for named modules. I need to reconsider it.

Header Units are "special"; they defeat automatic processing, in my view.

  • The compiler cannot determine that a source is a HU (from the header source); it has to be told.
  • A build system can determine that a source depends on a header (of course, that has already been done for years); however, it is not at all clear that the build system is able to determine whether that dependent header is an "importable header" and therefore suitable for building as a HU. Maybe the build system folks have some ideas, of course (and a sub-set of headers are pre-defined as importable).

So I suspect that we will be expecting (at least in the short to medium term) the user's project to define which headers should be converted to HUs; we already have specific driver options to help with the actual production.
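
(The driver options I mean are along these lines - modulo the exact spelling:)

clang++ -std=c++20 -fmodule-header foo.h -o foo.pcm             # build foo.h as a header unit
clang++ -std=c++20 -fmodule-file=foo.pcm use.cpp -c -o use.o    # use.cpp does: import "foo.h";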

ChuanqiXu updated this revision to Diff 468138.Oct 17 2022, 1:59 AM

Address @tschuett's opinion:

  • Use the newly introduced -fc++-module-cache-path= instead of -fc++-module-path, to avoid entangling with the module-cache logic of Clang modules.
  • Use the newly introduced -fmodule-bmi-output= instead of -fmodule-name for the same reason. Also, -fmodule-bmi-output= is more flexible, since -fmodule-name requires its argument to match the actual module name.
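
The intended usage looks something like this (illustrative; the spellings are what this revision proposes and may change during review):

clang++ -std=c++20 Hello.cppm main.cpp -fc++-module-cache-path=./pcm.cache
clang++ -std=c++20 Hello.cppm -c -o Hello.o -fmodule-bmi-output=Hello.pcm
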
ChuanqiXu added a comment.EditedOct 17 2022, 2:11 AM

@dblaikie @iains @tschuett @boris There are many opinions on this page. They are about higher-level problems, and they are important and helpful indeed; I really appreciate them. But most of them are not related to this diff itself, and I don't think they are blocking issues for this diff.

This diff itself is simple compilation-invocation sugar to me (a term I made up myself, by analogy with syntactic sugar in programming languages; I am not sure if there is an existing/proper term). It is simple/cheap and it won't do anything harmful. And it is helpful for users to use modules and for implementors of build systems to do some quick POCs. For users, with the help of this patch we can use modules in the current build systems; one example is: https://reviews.llvm.org/D135507. Without this patch, I guess we could only manage it with some complex add_custom_* things, which would be much harder. And for the implementors of build systems, it should be helpful too. Although I guess all of us here know that one-phase compilation can't reach the maximum parallelism, it should be good for making progress.

I really hope we can get this one accepted.

iains added a comment.Oct 17 2022, 5:33 AM

Address @tschuett's opinion:

  • Use the newly introduced -fc++-module-cache-path= instead of -fc++-module-path, to avoid entangling with the module-cache logic of Clang modules.
  • Use the newly introduced -fmodule-bmi-output= instead of -fmodule-name for the same reason. Also, -fmodule-bmi-output= is more flexible, since -fmodule-name requires its argument to match the actual module name.

I think the second one is too much "implementor-speak" as a user-facing option name, perhaps keep it consistent with the first and make it something like "-fc++-module-filename=" (unless it can be a path - in which case -fc++-module-pathname=).

Although modular code is user-facing - BMIs are an implementation detail, right?

(sorry I've not been following all this as closely as I should - but I don't think BMIs are an implementation detail, any more than object files are - users should/will be as aware of BMIs as they are of .o files - they're build artifacts that need to be cached, can be out of date, can be deleted/maybe copied/moved around in limited ways, have to be passed to other tools in the build pipeline. Not sure what term to use, I guess compilers don't often have warnings/errors/diagnostics that explicitly talk about "object files" - so maybe in a similar sense they won't explicitly talk about "binary module artifact files" but maybe?)

I agree it won't reach the maximum parallelism. But I think it should be able to rebuild correctly if the build system understands the dependencies between module units. For example, if B.cpp imports module A, and A is defined in A.cppm, then when A.cppm changes, it will be fine as long as the build system compiles A.cppm first and compiles B.cpp afterwards. I think this is achievable by the build system (for example, via the P1689 proposal I'm working on). So the problem becomes a performance problem instead of a correctness problem, which doesn't look bad to me. I still feel we shouldn't let the perfect be the enemy of the better.

The build system still needs to know that B.cppm depends on A.cppm - and once it knows that, it's not a huge cost for it to know the name of the file that represents that dependency and is produced by A.cppm and passed to B.cppm, I think?

In short - seems like we should separate out the cache discussion from the "one phase" compilation in the sense of a single build action that takes a .cppm and generates both a .o and a .pcm in one compiler/driver invocation. (maybe something like this is what @iains has already sent out in another review?)

to @iains' point about "it'd be good if we didn't have to invoke two underlying commands from the one driver invocation" - yeah, agreed. Though I wouldn't mind one step being "add the driver interface" and another being "fix whatever serialization issues/etc. might stand in the way of doing .cppm->{.o,.pcm} in a single action without serialization, so we can then start stripping stuff out of the .pcm since it'll only need to contain the interface, and not have to worry about having enough info for .o generation anymore"

Although modular code is user-facing - BMIs are an implementation detail, right?

but I don't think BMIs are an implementation detail, any more than object files are - users should/will be as aware of BMIs as they are of .o files - they're build artifacts that need to be cached, can be out of date, can be deleted/maybe copied/moved around in limited ways, have to be passed to other tools in the build pipeline. Not sure what term to use, I guess compilers don't often have warnings/errors/diagnostics that explicitly talk about "object files" - so maybe in a similar sense they won't explicitly talk about "binary module artifact files" but maybe?

Yeah, agreed. I guess @iains's meaning may be "if the toolchains are ready enough one day, then the end user won't need to touch BMIs directly". That looks true: nowadays, many C++ programmers don't need to touch .o files, since the build systems take care of it (from my personal experience). But from the perspective of a compiler, the concept of a BMI may be necessary. I remember when I wrote the documents for modules, I used the term "module file" at first; then @ruoso corrected me that we should use BMI here. So I think using the term BMI is correct here.

The build system still needs to know that B.cppm depends on A.cppm - and once it knows that, it's not a huge cost for it to know the name of the file that represents that dependency and is produced by A.cppm and passed to B.cppm, I think?

In a private chat with the Kitware guys, they told me that if the one-phase compilation isn't supported, they can mimic it by replacing the original command with the combination of:

clang++ -std=c++20 --precompile src.cppm -o src.pcm
clang++ -std=c++20 src.pcm -c -o src.o

but then the .pcm files won't be direct dependencies, so it may be harder; I am not sure. @ben.boeckel any update?

In short - seems like we should separate out the cache discussion from the "one phase" compilation in the sense of a single build action that takes a .cppm and generates both a .o and a .pcm in one compiler/driver invocation. (maybe something like this is what @iains has already sent out in another review?)

Agreed.

to @iains' point about "it'd be good if we didn't have to invoke two underlying commands from the one driver invocation" - yeah, agreed. Though I wouldn't mind one step being "add the driver interface" and another being "fix whatever serialization issues/etc. might stand in the way of doing .cppm->{.o,.pcm} in a single action without serialization, so we can then start stripping stuff out of the .pcm since it'll only need to contain the interface, and not have to worry about having enough info for .o generation anymore"

(I guess you're saying without deserialization)

Agreed at a high level.

But what do you mean by stripping stuff out of the .pcm? Do you mean removing some function bodies when writing the BMI? If so, it'll be problematic: when another TU imports the BMI, it won't see the function bodies it could see before, which will prevent optimization. It'd be fine as an optional flag so that the user knows what the effect is, but it might not be good to do by default. (Although this paragraph starts to talk about other topics.)
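
Concretely, the case I worry about looks like this (a made-up example):

// m.cppm
export module m;
export int add(int a, int b) { return a + b; }  // the body is serialized into m.pcm today

// user.cpp
import m;
int g() { return add(1, 2); }  // add() can be inlined here only because the BMI carries
                               // the body; without it we'd have to rely on LTO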

But what do you mean by stripping stuff out of the .pcm? Do you mean removing some function bodies when writing the BMI? If so, it'll be problematic: when another TU imports the BMI, it won't see the function bodies it could see before, which will prevent optimization. It'd be fine as an optional flag so that the user knows what the effect is, but it might not be good to do by default. (Although this paragraph starts to talk about other topics.)

  • Yes, all of this stuff is important (and, yes, we are drifting into other [related] topics) - we already have a mechanism for providing function bodies to optimisation [LTO] - I do not think we should want to make module interfaces larger than necessary to duplicate that functionality (the reason we do now is because all the information needs to be present to feed into codegen) - it has been a long-term objective (I think listed even in @rsmith 's modules TODO list) to remove the unneeded BMI content.
  • Actually, my point was meant to be quite simple and directly related to this patch: the choice of option name should be something that is obvious to the end user; ideally we should have related options named similarly, and we should avoid using terms that are familiar to compiler engineers but not necessarily to the end user.

I am also OK with doing this in two steps (first in the driver with this patch and then by updating the FE to allow the two outputs from one invocation - my draft patch series).

BTW: I did mean to ask before ... did you consider this (existing) command syntax?

-fmodule-file=[<name>=]<file>

and see if it works for your case? (it seems that it should, for consistency).
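
(Today it is used on the consumer side like this - illustrative:)

clang++ -std=c++20 --precompile a.cppm -o a.pcm
clang++ -std=c++20 -fmodule-file=a=a.pcm main.cpp -c -o main.o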

Yes, all of this stuff is important (and, yes, we are drifting into other [related] topics) - we already have a mechanism for providing function bodies to optimisation [LTO] - I do not think we should want to make module interfaces larger than necessary to duplicate that functionality (the reason we do now is because all the information needs to be present to feed into codegen) - it has been a long-term objective (I think listed even in @rsmith 's modules TODO list) to remove the unneeded BMI content.

I remembered @rsmith's TODO list as being about discarding entities in the GMF (I just found that I forgot to review your patches); I am not sure if I misremembered. And I agree it'll be great to reduce the size of module interfaces; I've heard the concern from Mathias that the size of the BMI may affect distributed builds. In our internal practice, we also use ThinLTO, and we tune ThinLTO for better performance. BUT from the general perspective of users, (full) LTO is not widely used; I only see people use it where they need a benchmark score or in some kernel cases. It is not common. Also, from our limited experience, dropping function bodies from the module interface and using ThinLTO gives worse performance than plain ThinLTO, so this is a performance regression too. I mean, it'll be a drastic change, so it'll be a longer-term objective.

Actually, my point was meant to be quite simple and directly related to this patch: the choice of option name should be something that is obvious to the end user; ideally we should have related options named similarly, and we should avoid using terms that are familiar to compiler engineers but not necessarily to the end user.

Yeah, agreed. But as I said, I originally used the term module file, and @ruoso suggested the term BMI. So now in https://clang.llvm.org/docs/StandardCPlusPlusModules.html#built-module-interface-file, we use the term BMI for end users, and module-file is no longer a defined term. So it may be better to use BMI than module-file from that perspective.

I am also OK with doing this in two steps (first in the driver with this patch and then by updating the FE to allow the two outputs from one invocation - my draft patch series).

For your draft patches, I have only one high-level concern: whether we will/should restrict module declarations to files with special suffixes, since both @rsmith and MSVC made the same decision. I am not sure if they have special reasons to do so; maybe I need to search the discussions on the SG15 mailing lists. And another related concern (maybe concern is not the right word here) is that if your patches land, a lot of existing code may need to be removed; otherwise it'll be redundant code. But this might not be a blocking issue.

BTW: I did mean to ask before ... did you consider this (existing) command syntax?

-fmodule-file=[<name>=]<file>

and see if it works for your case? (it seems that it should, for consistency).

Oh, do you mean we should use the module-file name since we already have the -fmodule-file option? Good point. Yeah, it is a pity to have 2 terms describing the same thing. Personally I don't have a strong feeling about the option name. If @dblaikie has no other comments on this, I'll follow your suggestion.

ruoso added a comment.Oct 18 2022, 1:07 AM

FWIW, having a differently named argument for output files and input files is very important for tooling.

In particular, in the case of remote execution, it makes a huge difference being able to tell what are the inputs and outputs by looking at the arguments alone, so I would like to make the case for an explicit output argument.

In fact, this is something I'll request on GCC as well, since it currently uses the module mapper to figure out both the input and the output BMI files.

Is this a consensus of SG15?

ruoso added a comment.Oct 18 2022, 1:33 AM

Currently, no. I think we do need a paper to discuss the requirements of the remote execution protocol and how they relate to the implementation of C++ modules.

The simple summary is that one way remote execution is implemented is by just wrapping the compiler execution: creating a Merkle tree with all the inputs, identifying all outputs, shipping that to a remote worker, and returning another Merkle tree with the outputs.

Yeah, it is always better to have standard protocols. So the current state of module suffixes (or a differently named argument for output files, in your terms) still needs to be discussed. Personally, I agree with the special suffix for the sake of readability. @iains I think it may be better to suspend your draft patches until SG15 reaches consensus. From my understanding, the problem may not be related to the client/server mode, right? So I guess this shouldn't block your future work.

iains added a comment.Oct 18 2022, 2:32 AM

once again we are getting off topic for this patch :)

We (compiler engineers) will have to cater for multiple models used by different build systems.

SG15 might give guidance/recommendations, but in the end the standard's normative text is not likely to make a 'discovery-based' scheme like build2 non-conforming.
So, of course, it is possible to file a GCC bugzilla requesting the ability to give a module output filename as a command line option (hopefully using the same option spelling as clang). Allowing both compilers to operate with each other's [C++20] command lines is a great objective ..

once again we are getting off topic for this patch :)

Yeah, let's wait for @dblaikie's opinion on the flag name : )

The remaining question: use -fc++-module-bmi-output=, -fc++-module-filename=, or something else (maybe -fc++-module-output=)?

I'm OK with sticking with the existing -fmodule-file if that works for everyone. Yeah, it's short and ambiguous in a space with many concepts of what a "module file" is, but also the fact that it's a -f flag might help disambiguate it a bit - it's probably not the way anyone would think/expect to be passing source files, those just get passed without flags on the command line. And any use of it will show the .pcm extension or whatever that should make it clear enough what's going on.

I'm OK with sticking with the existing -fmodule-file if that works for everyone. Yeah, it's short and ambiguous in a space with many concepts of what a "module file" is, but also the fact that it's a -f flag might help disambiguate it a bit - it's probably not the way anyone would think/expect to be passing source files, those just get passed without flags on the command line. And any use of it will show the .pcm extension or whatever that should make it clear enough what's going on.

hmm (I realise I mentioned this, and hope it has not complicated things) ..

.. I was thinking of the -fmodule-file=<name>=filename variant. The problem with using it without (the <name>) is that -fmodule-file= can (and does) appear multiple times on the command line to specify dependent modules - so that one would have to specify which (named) module was related to the filename.

In a pre-scanned world, the build system does know the info for each source file (published and dependent modules) [which ought to dispel some of the concerns raised about not knowing about possible outputs for implementation/interface cases].

In a discovery world, the interface to the build system carries all of this traffic anyway so that the command line would only be providing pre-set data for that.

.. and we do not care about header units in this discussion since they have to be handled specially anyway.
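
For concreteness, a sketch of the multiple-input case being described, assuming two hypothetical modules A (in A.cppm) and B (in B.cppm) where B imports A, and assuming the <name>= form behaves as discussed in this thread:

clang++ -std=c++20 --precompile A.cppm -o A.pcm
clang++ -std=c++20 --precompile -fmodule-file=A=A.pcm B.cppm -o B.pcm
clang++ -std=c++20 -fmodule-file=A=A.pcm -fmodule-file=B=B.pcm -c use.cpp

Both -fmodule-file= entries on the last line name inputs, which is why a plain -fmodule-file=filename cannot double as an output specification without saying which module it belongs to.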

In a pre-scanned world, the build system does know the info for each source file (published and dependent modules) [which ought to dispel some of the concerns raised about not knowing about possible outputs for implementation/interface cases].

In a discovery world, the interface to the build system carries all of this traffic anyway so that the command line would only be providing pre-set data for that.

This does not cover all build systems. For example, Coverity relies on observing the compiler invocations performed by another build system and relies on interpreting the command lines of those invocations in order to identify all inputs and outputs (and Coverity does require awareness of outputs as well as inputs). Other tools that operate on a compilation database or monitors like Build EAR have similar requirements.

iains added a comment.Oct 19 2022, 2:59 PM

In a pre-scanned world, the build system does know the info for each source file (published and dependent modules) [which ought to dispel some of the concerns raised about not knowing about possible outputs for implementation/interface cases].

In a discovery world, the interface to the build system carries all of this traffic anyway so that the command line would only be providing pre-set data for that.

This does not cover all build systems. For example, Coverity relies on observing the compiler invocations performed by another build system and relies on interpreting the command lines of those invocations in order to identify all inputs and outputs (and Coverity does require awareness of outputs as well as inputs). Other tools that operate on a compilation database or monitors like Build EAR have similar requirements.

I did not intend to derail the discussion of how the command line option should be spelled - but for clarity - I continue to expect that we will need to support both styles of build system, and the user will need to choose what is appropriate for their workflow - nothing in the proposals here impacts on that, right?

I'm OK with sticking with the existing -fmodule-file if that works for everyone. Yeah, it's short and ambiguous in a space with many concepts of what a "module file" is, but also the fact that it's a -f flag might help disambiguate it a bit - it's probably not the way anyone would think/expect to be passing source files, those just get passed without flags on the command line. And any use of it will show the .pcm extension or whatever that should make it clear enough what's going on.

hmm (I realise I mentioned this, and hope it has not complicated things) ..

.. I was thinking of the -fmodule-file=<name>=filename variant. The problem with using it without (the <name>) is that -fmodule-file= can (and does) appear multiple times on the command line to specify dependent modules - so that one would have to specify which (named) module was related to the filename.

I'm really not following, sorry (which is a statement I could put on almost every discussion about modules build implications, to be honest - as much as it's something I've been involved with for years - so it's not your fault, as such, just that this stuff is hard, I think)

I /think/ currently -fmodule-file works with Clang header modules presumably because the module files contain within them enough information to enable the compiler to substitute the module for an include where necessary? I'm not sure the <name>= part is necessary, as such, if the BMIs can reasonably contain enough identifying info to make their use clear to the consumer.

iains added a comment.Oct 19 2022, 3:40 PM

I'm OK with sticking with the existing -fmodule-file if that works for everyone. Yeah, it's short and ambiguous in a space with many concepts of what a "module file" is, but also the fact that it's a -f flag might help disambiguate it a bit - it's probably not the way anyone would think/expect to be passing source files, those just get passed without flags on the command line. And any use of it will show the .pcm extension or whatever that should make it clear enough what's going on.

hmm (I realise I mentioned this, and hope it has not complicated things) ..

.. I was thinking of the -fmodule-file=<name>=filename variant. The problem with using it without (the <name>) is that -fmodule-file= can (and does) appear multiple times on the command line to specify dependent modules - so that one would have to specify which (named) module was related to the filename.

I /think/ currently -fmodule-file works with Clang header modules presumably because the module files contain within them enough information to enable the compiler to substitute the module for an include where necessary? I'm not sure the <name>= part is necessary, as such, if the BMIs can reasonably contain enough identifying info to make their use clear to the consumer.

-fmodule-file= works with both clang modules and standard C++ ones. What it does (as of now) is cause the FE to open the named file and to pre-load the module name - so that when the FE tries to find a module the lookup mechanism can associate the module name with the module data. So, if we need 3 dependent modules to be available when building a new source .. we would specify the PCM files containing those modules.

At present, if you try to specify a non-existent PCM using -fmodule-file= the compile will fail early on (it is expecting this to indicate a source module, not a destination one).

However, I think the second syntax, -fmodule-file=<name>=filename, should be able to work around this (since it says that the named module <name> is associated with the PCM filename, which would allow us to cater for that file being missing when we are creating it).

Does that clarify at all ?

It would be great not to add more modules options flags, there are already way to many :/ - but if this seems too complex then one of the spellings suggested by @ChuanqiXu would work.

I'm OK with sticking with the existing -fmodule-file if that works for everyone. Yeah, it's short and ambiguous in a space with many concepts of what a "module file" is, but also the fact that it's a -f flag might help disambiguate it a bit - it's probably not the way anyone would think/expect to be passing source files, those just get passed without flags on the command line. And any use of it will show the .pcm extension or whatever that should make it clear enough what's going on.

hmm (I realise I mentioned this, and hope it has not complicated things) ..

.. I was thinking of the -fmodule-file=<name>=filename variant. The problem with using it without (the <name>) is that -fmodule-file= can (and does) appear multiple times on the command line to specify dependent modules - so that one would have to specify which (named) module was related to the filename.

I /think/ currently -fmodule-file works with Clang header modules presumably because the module files contain within them enough information to enable the compiler to substitute the module for an include where necessary? I'm not sure the <name>= part is necessary, as such, if the BMIs can reasonably contain enough identifying info to make their use clear to the consumer.

-fmodule-file= works with both clang modules and standard C++ ones. What it does (as of now) is cause the FE to open the named file and to pre-load the module name - so that when the FE tries to find a module the lookup mechanism can associate the module name with the module data. So, if we need 3 dependent modules to be available when building a new source .. we would specify the PCM files containing those modules.

At present, if you try to specify a non-existent PCM using -fmodule-file= the compile will fail early on (it is expecting this to indicate a source module, not a destination one).

However, I think the second syntax, -fmodule-file=<name>=filename, should be able to work around this (since it says that the named module <name> is associated with the PCM filename, which would allow us to cater for that file being missing when we are creating it).

Would the intent be that this might write out multiple different module files? I would hope we don't need to support that - that we only generate the module file for the single source module passed to the compiler. I guess maybe to support clang++ x.cppm y.cppm you might have multiple output files? But in that case you don't get to use -o so maybe you don't get to specify the module paths either and just get the default "same as .o but with .pcm suffix" behavior?

Does that clarify at all ?

It would be great not to add more modules options flags, there are already way to many :/ - but if this seems too complex then one of the spellings suggested by @ChuanqiXu would work.

Oh, right, super sorry - this is about how to specify the output filename. OK.

(I'd still sort of lean towards "make it the same as the .o, but with the .pcm suffix" and if the build system wants to put things in other places, it can move them around - but I understand why that's not something everyone's on board with)

Yeah, if the -fmodule-file syntax is currently only for input I don't think we need to try to overload it for output as well.

Perhaps we could extend the --precompile flag to take a filename? (but then that'd potentially confuse things when you want to produce both PCM and .o, passing --precompile currently only produces the .pcm on the main output path... ).

Anyone got a list of other compiler flags that take output file names? (I guess profile generation, maybe some other flags do this? Might be worth comparing/contrasting/seeking inspiration from those? I guess -save-temps uses the "same path, different suffix" strategy)
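
For comparison, a sketch of what the "same path, different suffix" strategy would mean here; note this is the behavior being proposed, not what clang does today:

clang++ -std=c++20 -c Hello.cppm -o build/Hello.o
# under the proposed default, this would additionally produce
# build/Hello.pcm, with the path derived from the -o argument

A build system that wanted the BMI elsewhere would then have to move it, which is the trade-off being debated.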

iains added a comment.Oct 19 2022, 4:25 PM

I'm OK with sticking with the existing -fmodule-file if that works for everyone. Yeah, it's short and ambiguous in a space with many concepts of what a "module file" is, but also the fact that it's a -f flag might help disambiguate it a bit - it's probably not the way anyone would think/expect to be passing source files, those just get passed without flags on the command line. And any use of it will show the .pcm extension or whatever that should make it clear enough what's going on.

hmm (I realise I mentioned this, and hope it has not complicated things) ..

.. I was thinking of the -fmodule-file=<name>=filename variant. The problem with using it without (the <name>) is that -fmodule-file= can (and does) appear multiple times on the command line to specify dependent modules - so that one would have to specify which (named) module was related to the filename.

I /think/ currently -fmodule-file works with Clang header modules presumably because the module files contain within them enough information to enable the compiler to substitute the module for an include where necessary? I'm not sure the <name>= part is necessary, as such, if the BMIs can reasonably contain enough identifying info to make their use clear to the consumer.

-fmodule-file= works with both clang modules and standard C++ ones. What it does (as of now) is cause the FE to open the named file and to pre-load the module name - so that when the FE tries to find a module the lookup mechanism can associate the module name with the module data. So, if we need 3 dependent modules to be available when building a new source .. we would specify the PCM files containing those modules.

At present, if you try to specify a non-existent PCM using -fmodule-file= the compile will fail early on (it is expecting this to indicate a source module, not a destination one).

However, I think the second syntax, -fmodule-file=<name>=filename, should be able to work around this (since it says that the named module <name> is associated with the PCM filename, which would allow us to cater for that file being missing when we are creating it).

Would the intent be that this might write out multiple different module files? I would hope we don't need to support that - that we only generate the module file for the single source module passed to the compiler. I guess maybe to support clang++ x.cppm y.cppm you might have multiple output files? But in that case you don't get to use -o so maybe you don't get to specify the module paths either and just get the default "same as .o but with .pcm suffix" behavior?

Does that clarify at all ?

It would be great not to add more modules options flags, there are already way to many :/ - but if this seems too complex then one of the spellings suggested by @ChuanqiXu would work.

Oh, right, super sorry - this is about how to specify the output filename. OK.

(I'd still sort of lean towards "make it the same as the .o, but with the .pcm suffix" and if the build system wants to put things in other places, it can move them around - but I understand why that's not something everyone's on board with)

Actually, I would think that to be a great default (and put it in the CWD just like the object),
.. but it seems (in addition to the default) we do need a way to place specific module files - and I could imagine needing to make different places for different command line option sets. Asking the driver or build system to move the file seems like spawning more processes and we'd still need a user-visible way to say where we wanted the file to be put.

Yeah, if the -fmodule-file syntax is currently only for input I don't think we need to try to overload it for output as well.

My understanding [which could be flawed] is that the second syntax (which already exists, it's not an invention of this discussion) is intended to allow specification of output files.

Perhaps we could extend the --precompile flag to take a filename? (but then that'd potentially confuse things when you want to produce both PCM and .o, passing --precompile currently only produces the .pcm on the main output path... ).

yeah, --precompile is well-defined to produce only the module (and I have a patch to alias it to -fmodule-only to match the GCC equivalent); it would not help to conflate that with the current objective.

Anyone got a list of other compiler flags that take output file names? (I guess profile generation, maybe some other flags do this? Might be worth comparing/contrasting/seeking inspiration from those? I guess -save-temps uses the "same path, different suffix" strategy)

I'll see if I can find my master list of modules options (tomorrow .. erm later today).

Yeah, I am a little bit surprised/confused to see us talking about the behavior of -fmodule-file. I guess the intention of @iains was to raise that option to show that -fc++-module-file-path (or -fc++-module-file-output) is the better choice. The point here is the name rather than the behavior.

(I'd still sort of lean towards "make it the same as the .o, but with the .pcm suffix" and if the build system wants to put things in other places, it can move them around - but I understand why that's not something everyone's on board with)

Actually, I would think that to be a great default (and put it in the CWD just like the object),
.. but it seems (in addition to the default) we do need a way to place specific module files - and I could imagine needing to make different places for different command line option sets. Asking the driver or build system to move the file seems like spawning more processes and we'd still need a user-visible way to say where we wanted the file to be put.

For the default location, my original intention was to not pollute (I know this may not be the best word) the user's space. For example, when we compile a hello example (without modules):

clang++ hello.cpp -o hello

we won't see hello.o in the current directory; it lives in the /tmp directory. I guess the intuition is to avoid showing users things that they can ignore. So I chose to set the default location to be the same as for clang modules (under ~/cache/clang/module...).

As for the need to make different placements for different command line option sets, I feel like the current design (-fc++-modules-cache-path and -fc++-module-bmi-output (or -fc++-module-file-output)) is capable of handling it. For example, different command lines can set a different -fc++-modules-cache-path or a different -fc++-module-file-output.
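
As a sketch of how the two flags proposed in this patch would combine (both spellings are still under discussion in this thread, so this is illustrative only):

# per-TU override: put this TU's BMI exactly here
clang++ -std=c++20 -c Hello.cppm -fc++-module-file-output=build/pcm/Hello.pcm
# per-invocation cache: implicitly generated BMIs go under this directory
clang++ -std=c++20 -c Hello.cppm -fc++-modules-cache-path=build/module-cache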

Yeah, if the -fmodule-file syntax is currently only for input I don't think we need to try to overload it for output as well.

My understanding [which could be flawed] is that the second syntax (which already exists, it's not an invention of this discussion) is intended to allow specification of output files.

No (if I don't misunderstand your point): -fmodule-file is only about the inputs. The difference between -fmodule-file=a.pcm and -fmodule-file=A=a.pcm is lazy module loading. With the first syntax, clang will try to load a.pcm at startup; with the second, clang will only try to load a.pcm when the module A is required. So I can't see the relationship with output files.
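
A sketch of the difference, assuming a module A already built into a.pcm:

# eager: a.pcm is opened up front, and must exist even if A is never imported
clang++ -std=c++20 -fmodule-file=a.pcm -c use.cpp
# lazy: a.pcm is only opened if use.cpp actually contains 'import A;'
clang++ -std=c++20 -fmodule-file=A=a.pcm -c use.cpp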

Perhaps we could extend the --precompile flag to take a filename? (but then that'd potentially confuse things when you want to produce both PCM and .o, passing --precompile currently only produces the .pcm on the main output path... ).

yeah, --precompile is well-defined to produce only the module (and I have a patch to alias it to -fmodule-only to match the GCC equivalent); it would not help to conflate that with the current objective.

Agreed. It may not be good to change the semantics of an existing option.

Anyone got a list of other compiler flags that take output file names? (I guess profile generation, maybe some other flags do this? Might be worth comparing/contrasting/seeking inspiration from those? I guess -save-temps uses the "same path, different suffix" strategy)

I'll see if I can find my master list of modules options (tomorrow .. erm later today).

I guess @dblaikie is talking about the general output flags (not limited to modules). I'll try to make a list today too.

I grepped options.td and got an (incomplete) list of options that take an output name:

# -o and its alias
-o
-object_file_name=
--output=

/Fa (Windows for assembly output filename)
/Fe (Windows for output executable file name)
/Fi (Windows for preprocessed output filename)
/Fo (Windows for object file)

-dependency-dot (for DOT-formatted header dependencies)
-dependency-file (to write dependency output to)

-header-include-file (Filename to write header include output to)

-opt-record-file (Filename to use for YAML optimization record output)

-split-dwarf-output (Filename to use for split dwarf debug info output)

-stack-usage-file (to write stack usage output to)
-coverage-data-file (Emit coverage data to this filename)
-coverage-notes-file (Emit coverage notes to this filename)

And it looks like -file appears a lot, so maybe the suggestion (-fc++-module-file-output) is better. As for the default location, I feel like my explanation above makes sense: if end users want to produce .pcm files, they can use --precompile, just like they use -c to get object files. This only matters for end users, since build systems should/would choose other locations.
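
The analogy in command form, using only flags that exist today:

# -c stops the pipeline after producing the object file
clang++ -std=c++20 -c Hello.cppm -o Hello.o
# --precompile stops the pipeline after producing the BMI
clang++ -std=c++20 --precompile Hello.cppm -o Hello.pcm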

iains added a comment.EditedOct 19 2022, 11:38 PM

I grepped options.td and got an (incomplete) list of options that take an output name:

# -o and its alias
-o
-object_file_name=
--output=

/Fa (Windows for assembly output filename)
/Fe (Windows for output executable file name)
/Fi (Windows for preprocessed output filename)
/Fo (Windows for object file)

-dependency-dot (for DOT-formatted header dependencies)
-dependency-file (to write dependency output to)

-header-include-file (Filename to write header include output to)

-opt-record-file (Filename to use for YAML optimization record output)

-split-dwarf-output (Filename to use for split dwarf debug info output)

-stack-usage-file (to write stack usage output to)
-coverage-data-file (Emit coverage data to this filename)
-coverage-notes-file (Emit coverage notes to this filename)

And it looks like -file appears a lot, so maybe the suggestion (-fc++-module-file-output) is better. As for the default location, I feel like my explanation above makes sense: if end users want to produce .pcm files, they can use --precompile, just like they use -c to get object files. This only matters for end users, since build systems should/would choose other locations.

OK. I guess the idea about -fmodule-file=<name>=filename was that, because the FE will not try to read filename (for module-generation cases), we could use it to describe the output file. However, it seems that might be too complex...

edit: Never mind, clearly a brainstorm on my part; that is obviously not workable in the case that there are multiple -fmodule-file=<name>=filename entries on the command line.

so -fc++-module-file-output seems OK to me.

ChuanqiXu updated this revision to Diff 469136.Oct 20 2022, 2:05 AM

Use -fc++-module-file-output to address the comments.

(I'd still sort of lean towards "make it the same as the .o, but with the .pcm suffix" and if the build system wants to put things in other places, it can move them around - but I understand why that's not something everyone's on board with)

Actually, I would think that to be a great default (and put it in the CWD just like the object),
.. but it seems (in addition to the default) we do need a way to place specific module files - and I could imagine needing to make different places for different command line option sets. Asking the driver or build system to move the file seems like spawning more processes and we'd still need a user-visible way to say where we wanted the file to be put.

At least I'd like these to be separate patches - adding ".cppm -> {.pcm + .o}" action and then separately adding a way to specify the name of the .pcm output file.

But I still have some reservations about even providing a flag for it - I think the example I was thinking of was coverage files ( https://gcc.gnu.org/onlinedocs/gcc/Gcov-Data-Files.html ) which also don't have a custom path option, like Split DWARF. I'd be happy not to add this until someone comes back with a demonstrated need - it doesn't seem like it'd be a major imposition on a build system to copy the file if it needed to move it into a release tree vs. the build tree, for instance. And if it did add up to real performance problems, we could add it.

Yeah, if the -fmodule-file syntax is currently only for input I don't think we need to try to overload it for output as well.

My understanding [which could be flawed] is that the second syntax (which already exists, it's not an invention of this discussion) is intended to allow specification of output files.

I haven't seen evidence of that, but it's certainly not something I've got lots of info on. It looks like it was added in https://reviews.llvm.org/D35020 as a search path.

I think the first patch (be it using this review or another) should be just the ".cppm -> {.pcm + .o}" (which should basically be entirely in the driver, right? Since for now it'll still be two actions that are already implemented) and then another for "here's a flag to name the .pcm output file" (not sure if we need that, really, just yet) and possibly another for "here's an option to write the .pcm file out to a cache directory instead of the .o path or custom .pcm output path" (I've even stronger reservations about that). I guess discussing the second and third somewhat together, if their flag interface might interact (since they both specify where the .pcm file goes).

ChuanqiXu added a comment.EditedOct 20 2022, 11:42 PM

@dblaikie I am confused about your concern. Regarding the coverage example: on the one hand, it is a completely different case, since the coverage report is generated at runtime instead of at compile time; on the other hand, it does provide a -fprofile-dir option to control the output directory.

I'd be happy not to add this until someone comes back with a demonstrated need

Yeah, we have demos. Both https://github.com/ChuanqiXu9/llvm-project/tree/MyP1689/P1689Examples and https://reviews.llvm.org/D135507 are demos for this.

it doesn't seem like it'd be a major imposition on a build system to copy the file if it needed to move it into a release tree vs. the build tree, for instance. And if it did add up to real performance problems, we could add it.

That doesn't make sense to me. Yes, the build system can provide the same behavior with mv commands, but why should we push that effort onto the build systems? Especially since it costs us nothing to generate the pcm files in the specified locations, except an additional flag. And moving the pcm files is a real overhead, at least for now. For example, for the following simple module unit:

module;
#include <iostream>
export module Hello;
export void HelloWorld() {
    std::cout << "Hello World.\n";
}

the size of the generated pcm file is 4.3M. I know we can say that it is bad that the compiler generates such big pcm files, and it would be great if we could reduce them. However, this is the status quo.

then another for "here's a flag to name the .pcm output file" (not sure if we need that, really, just yet)

It is needed so that build systems can specify the location of pcm output files. In fact, this option was requested by @ben.boeckel.

possibly another for "here's an option to write the .pcm file out to a cache directory instead of the .o path or custom .pcm output path" (I've even stronger reservations about that).

To be honest, at least for the current experiments with the Kitware folks, this is not necessary. However, I think it is helpful for people who want to (try to) use modules in their current build systems. We know this is not ideal, but it is much better than the status quo.


After thinking about this more, I found that the root divergence between us is the principle (or bar) for adding compilation-flag sugar (not functionality, but a flag). I feel like the standard you are applying to this patch is the standard I would apply to new functionality rather than to compilation-flag sugar.

What is compilation-flag sugar to me? My answer: if we can get the same result without a flag, then that flag is compilation-flag sugar to me. By that measure, the whole of this patch is compilation sugar. How can we get the same result without this patch? For example, for the simple Hello.cppm, we can do the following steps:

clang++ -std=c++20 Hello.cppm -save-temps -c

then we can see the following files in the current directory:

Hello.bc  Hello.cppm  Hello.iim  Hello.o  Hello.pcm  Hello.s

then, after removing all the unwanted files, we are left with:

Hello.cppm Hello.pcm

Now we can move Hello.pcm to the so-called cache directory. Then we get exactly the same result without this patch! This is why I feel this patch is innocent: it only reduces the above steps to two flags. And if we take necessity as the standard for this patch, then the whole of this patch is meaningless, since the user can still get the same results with some shell tricks.

But I think it is good to do such things for our users. We shouldn't push such burdens onto our users if we can avoid it, and introducing these two flags shouldn't be a burden on us.
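
Spelled out, the shell trick described above, assuming the default cache location mentioned later in this thread:

clang++ -std=c++20 Hello.cppm -save-temps -c
# drop the temporaries we don't want
rm Hello.bc Hello.iim Hello.o Hello.s
# move the BMI into the cache directory by hand
mkdir -p ~/.cache/clang/ModuleCache
mv Hello.pcm ~/.cache/clang/ModuleCache/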

I'm getting a bit exhausted with all the words involved here & not sure how to simplify/clarify this.

If @ben.boeckel has particular use cases, it might be easier for him to be here discussing them so we can discuss the tradeoffs directly rather than through intermediaries.

I think the choices of flags, even when they represent relatively minor work on the compiler side, are important in terms of how they shape the environment the compiler exists in. I have reservations about implementing the libCody and the scanner-based solutions (let alone also caching based solutions) - but that ship's probably already sailed in terms of it's implemented in GCC and build2 is using it. (sort of like open source software - we implement things for compatibility (like LGPL) but when we're the ones innovating/creating new things we can and should be more cautious/possibly more prescriptive to avoid creating more diversity/divergence than is necessary)

Please separate this work into isolated patches & we can discuss them separately. I think this review might be best to abandon as the subject line/description's out of synch and there's been a /lot/ of discussion going in a lot of directions such that it'd be hard to understand the conclusions/focus of this review at this point.

ChuanqiXu abandoned this revision.Oct 23 2022, 7:38 PM

I'm getting a bit exhausted with all the words involved here & not sure how to simplify/clarify this.

If @ben.boeckel has particular use cases, it might be easier for him to be here discussing them so we can discuss the tradeoffs directly rather than through intermediaries.

Agreed.

I think the choices of flags, even when they represent relatively minor work on the compiler side, are important in terms of how they shape the environment the compiler exists in. I have reservations about implementing the libCody and the scanner-based solutions (let alone also caching based solutions) - but that ship's probably already sailed in terms of it's implemented in GCC and build2 is using it. (sort of like open source software - we implement things for compatibility (like LGPL) but when we're the ones innovating/creating new things we can and should be more cautious/possibly more prescriptive to avoid creating more diversity/divergence than is necessary)

Please separate this work into isolated patches & we can discuss them separately. I think this review might be best to abandon as the subject line/description's out of synch and there's been a /lot/ of discussion going in a lot of directions such that it'd be hard to understand the conclusions/focus of this review at this point.

Yeah, I agree this thread is complex enough. I'll try to split the patches.

I'm getting a bit exhausted with all the words involved here & not sure how to simplify/clarify this.

If @ben.boeckel has particular use cases, it might be easier for him to be here discussing them so we can discuss the tradeoffs directly rather than through intermediaries.

Sorry, I've been lax on actually keeping up-to-date on all of these Clang threads.

The current state is that CMake's test suite for C++ modules does not work with clang because it doesn't provide the control CMake needs to do its explicit builds. @ChuanqiXu's MyP1689 branch is close except that the .pcm output path can't seem to be controlled closely enough for reliable builds. Namely, I would like:

  • the ability to disable the module cache (completely); CMake doesn't need nor want it
  • the ability to say put the generated .pcm that this TU will generate goes *here*

The latter is currently a juggle of flags and restrictions that I haven't been able to figure out. Without something like -fmodule-output-path=, convincing Clang to output its .pcm file to where CMake wants it to be is some combination of -fmodules-cache-path= and -fmodule-name= while hoping that the internal path computation inside of Clang produces what CMake wants. It's just far easier to have something like -o that gives the answer straight away.

For the former, I use -x c++module as well so that I don't have to care about any extension-sniffing behaviors. I also need to give a bogus -fmodule-cache-path= for module consumers because as soon as import is seen, it wants to run off to look at the cache instead of noticing that there's a -fmodule-file= for everything it needs already and the cache is 100% unused.
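
A sketch of the shape of invocation CMake is after, with every path stated explicitly and no cache involved; the module name and paths here are hypothetical:

# producer: the BMI lands exactly where the build system says
clang++ -std=c++20 --precompile importable.cppm -o CMakeFiles/tgt.dir/importable.pcm
# consumer: every dependency is named up front, so no cache lookup should be needed
clang++ -std=c++20 -fmodule-file=importable=CMakeFiles/tgt.dir/importable.pcm -c use.cpp

The complaint above is that the consumer step still insists on a module cache path even though -fmodule-file= already covers everything it needs.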

I'm getting a bit exhausted with all the words involved here & not sure how to simplify/clarify this.

If @ben.boeckel has particular use cases, it might be easier for him to be here discussing them so we can discuss the tradeoffs directly rather than through intermediaries.

Sorry, I've been lax on actually keeping up-to-date on all of these Clang threads.

The current state is that CMake's test suite for C++ modules does not work with clang because it doesn't provide the control CMake needs to do its explicit builds. @ChuanqiXu's MyP1689 branch is close except that the .pcm output path can't seem to be controlled closely enough for reliable builds. Namely, I would like:

  • the ability to disable the module cache (completely); CMake doesn't need nor want it
  • the ability to say put the generated .pcm that this TU will generate goes *here*

The latter is currently a juggle of flags and restrictions that I haven't been able to figure out. Without something like -fmodule-output-path=, convincing Clang to output its .pcm file to where CMake wants it to be is some combination of -fmodules-cache-path= and -fmodule-name= while hoping that the internal path computation inside of Clang produces what CMake wants. It's just far easier to have something like -o that gives the answer straight away.

For the former, I use -x c++module as well so that I don't have to care about any extension-sniffing behaviors. I also need to give a bogus -fmodule-cache-path= for module consumers because as soon as import is seen, it wants to run off to look at the cache instead of noticing that there's a -fmodule-file= for everything it needs already and the cache is 100% unused.

BTW, the bogus -fmodule-cache-path= for module consumers can be addressed by the existing -fprebuilt-module-path. And I sent the new revisions in D137058 and D137059.
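
For reference, a sketch of the -fprebuilt-module-path alternative, assuming the BMIs were written as <module-name>.pcm under a hypothetical build/pcm directory:

# consumers search the given directory for <module-name>.pcm instead of a cache
clang++ -std=c++20 -fprebuilt-module-path=build/pcm -c use.cpp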

I tried applying this patch to your MyP1689 branch (fixing conflicts with attempts there), but it seems that something isn't being plumbed properly:

clang-15: error: unknown argument: '-fc++-module-file-output=CMakeFiles/export_bmi_and_interfaces.dir/importable.pcm'
clang-15: error: unable to create default module cache path "/home/boeckb/.cache/clang/ModuleCache": No such file or directory

There's no way to disable the "module cache path" either (it fails here because I symlinked it to /dev/null to avoid things "working" behind my back).

I tried applying this patch to your MyP1689 branch (fixing conflicts with attempts there), but it seems that something isn't being plumbed properly:

clang-15: error: unknown argument: '-fc++-module-file-output=CMakeFiles/export_bmi_and_interfaces.dir/importable.pcm'
clang-15: error: unable to create default module cache path "/home/boeckb/.cache/clang/ModuleCache": No such file or directory

There's no way to disable the "module cache path" either (it fails here because I symlinked it to /dev/null to avoid things "working" behind my back).

Oh, that is odd. The -fc++-module-file-output option should take priority over creating the module cache path.

BTW, this patch should be discarded. We're now pursuing D137059 and D137058, where there is no module cache. I know it is a little bit frustrating to change the demo all the time... but let's turn in that direction now.

BTW, this patch should be discarded. We're now pursuing D137059 and D137058, where there is no module cache. I know it is a little bit frustrating to change the demo all the time... but let's turn in that direction now.

Sorry, Phab is obtuse/unfamiliar for me right now; the "Abandoned" state isn't prominent enough and all the pages look the same…

FWIW, I was able to make CMake's module test suite pass with this patch on top of your MyP1689 branch on your Github fork. I also added some diffs to help clean up output files. I'll try to get it to work with the replacement patches as well, but I just want this to be a working checkpoint.