This is an archive of the discontinued LLVM Phabricator instance.

[Modules] Add a command line option for enabling the modules feature exclusively for the Clang builtins.
Needs ReviewPublic

Authored by eladcohen on Oct 6 2016, 11:30 AM.

Details

Summary

-fexclusive-builtin-modules enables the clang 'modules' feature exclusively for the clang intrinsic header files.

The end goal of this effort is to have this option on by default for x86 targets so we could reduce the long compile time of the x86 intrinsic header files.

Diff Detail

Event Timeline

eladcohen updated this revision to Diff 73825.Oct 6 2016, 11:30 AM
eladcohen retitled this revision from to [Modules] Add a command line option for enabling the modules feature exclusively for the Clang builtins..
eladcohen updated this object.
eladcohen added reviewers: hans, rnk, zvi, rsmith, chandlerc.
eladcohen added a subscriber: cfe-commits.
bruno added a reviewer: bruno.Oct 6 2016, 1:30 PM
bruno added a subscriber: bruno.

Hi Elad,

Is there any reason why you can't explicit model this in your build system by pre-building the intrinsics and pointing to a module cache path containing the pcm files? By doing that we don't need to have a specific compile flag.

Hi Bruno,

The short answer is yes. Essentially it can be done, but we would actually like for all the users to always get this behavior, implicitly.

The long answer is that there is a history of problems regarding the intrinsic files:
http://lists.llvm.org/pipermail/cfe-dev/2016-May/048837.html
http://lists.llvm.org/pipermail/cfe-dev/2016-September/050943.html
Mainly compatibility issues because MSVC makes all the intrinsics available all the time, while in clang this is not always the case (On Windows the different h files are not-included unless asked for explicitly since AVX512 was added which impacted compile-time).
A suggestion was made to try and mitigate this by reducing the compile time using modules (only for the x86 intrinsics). This patch aims at adding this option. The new compile flag is just a milestone - if all goes well, we can turn this on by default (Then we won't actually need to use a specific compile flag, and we could always include all the intrinsic h files without their long compile time).

bruno edited edge metadata.Oct 17 2016, 11:19 AM

The long answer is that there is a history of problems regarding the intrinsic files:
http://lists.llvm.org/pipermail/cfe-dev/2016-May/048837.html
http://lists.llvm.org/pipermail/cfe-dev/2016-September/050943.html
Mainly compatibility issues because MSVC makes all the intrinsics available all the time, while in clang this is not always the case (On Windows the different h files are not-included unless asked for explicitly since AVX512 was added which impacted compile-time).

Thinking a bit more about this: did you ever consider a solution where "immintrin.h" (or any similar other) would use "#include_next" and "__has_include_next" to point to a second level (user specified) immintrin.h, which then could use ifdefs to only include what's relevant?

A suggestion was made to try and mitigate this by reducing the compile time using modules (only for the x86 intrinsics). This patch aims at adding this option. The new compile flag is just a milestone - if all goes well, we can turn this on by default (Then we won't actually need to use a specific compile flag, and we could always include all the intrinsic h files without their long compile time).

The speedup sounds nice, but it seems awkward IMO to have such specific behavior for x86 intrinsics only.

rsmith edited edge metadata.Oct 17 2016, 11:38 AM

I really don't like the command-line interface you're proposing here. It seems like it will be extremely confusing what something like -fmodules -fexclusive-builtin-modules means, for instance (usually, later -f flags override earlier ones, so does this *turn off* modules for everything other than builtin headers?). Instead, we should use a command-line interface that more naturally / obviously composes with other flags.

It seems like the behavior you want here is precisely:

`-fmodules -fno-implicit-module-maps -fmodule-map-file=<resource dir>/include/module.modulemap`

(The -fmodules-validate-system-headers part would only make sense for people using clang directly from their build area; installed / released versions of clang should not need this. Adding it makes your flag compose poorly with other uses of modules.)

I'd be happy with you adding a flag that is equivalent to -fmodule-map-file=<resource dir>/include/module.modulemap.

Also, it seems unlikely to me that we would ever turn this on by default. It will cause havoc for distributed builds, especially ones that run compilation actions in 'clean' environments, as (for instance) it will cause the complete set of intrinsics headers to be compiled into a module on every compile. So if this is your proposed solution for slow compiles using intrinsics headers, you should be aware that this is not a complete solution to that problem.

The long answer is that there is a history of problems regarding the intrinsic files:
http://lists.llvm.org/pipermail/cfe-dev/2016-May/048837.html
http://lists.llvm.org/pipermail/cfe-dev/2016-September/050943.html
Mainly compatibility issues because MSVC makes all the intrinsics available all the time, while in clang this is not always the case (On Windows the different h files are not-included unless asked for explicitly since AVX512 was added which impacted compile-time).

Thinking a bit more about this: did you ever consider a solution where "immintrin.h" (or any similar other) would use "#include_next" and "__has_include_next" to point to a second level (user specified) immintrin.h, which then could use ifdefs to only include what's relevant?

I don't think we really want the users to have to maintain an updated user specific version of immintrin.h. Moreover, I think that at the end we actually do want clang to always include all the intrinsics same as MSVC.

A suggestion was made to try and mitigate this by reducing the compile time using modules (only for the x86 intrinsics). This patch aims at adding this option. The new compile flag is just a milestone - if all goes well, we can turn this on by default (Then we won't actually need to use a specific compile flag, and we could always include all the intrinsic h files without their long compile time).

The speedup sounds nice, but it seems awkward IMO to have such specific behavior for x86 intrinsics only.

The feature itself is not x86 specific. It can be used for all the clang builtins. As for enabling it by default, I did have only x86 in mind but it can be done for all targets as well. (will probably require checking that all the other intrinsic headers are modular and that the modulemap is up to date).

I really don't like the command-line interface you're proposing here. It seems like it will be extremely confusing what something like -fmodules -fexclusive-builtin-modules means, for instance (usually, later -f flags override earlier ones, so does this *turn off* modules for everything other than builtin headers?). Instead, we should use a command-line interface that more naturally / obviously composes with other flags.

It seems like the behavior you want here is precisely:

`-fmodules -fno-implicit-module-maps -fmodule-map-file=<resource dir>/include/module.modulemap`

(The -fmodules-validate-system-headers part would only make sense for people using clang directly from their build area; installed / released versions of clang should not need this. Adding it makes your flag compose poorly with other uses of modules.)

I'd be happy with you adding a flag that is equivalent to -fmodule-map-file=<resource dir>/include/module.modulemap.

You are right, this is the behavior I was aiming at. I just hoped I could get a single flag to achieve it instead of 3, but I agree that it doesn't behave well with the other module flags.
I uploaded the new flag in a separate review: https://reviews.llvm.org/D25767

Also, it seems unlikely to me that we would ever turn this on by default. It will cause havoc for distributed builds, especially ones that run compilation actions in 'clean' environments, as (for instance) it will cause the complete set of intrinsics headers to be compiled into a module on every compile. So if this is your proposed solution for slow compiles using intrinsics headers, you should be aware that this is not a complete solution to that problem.

What do you mean by 'clean' environments? As in no intermediate artifacts (such as the .pcm files) could be cached?

Do you have any thoughts or suggestions on how we can make the solution more complete?

I guess one possibility to handle this matter (IIRC mentioned by Chandler in one of the discussions) is to pre-compile the actual builtins module as part of the build system. But IIUC this is not feasible since there could be a huge matrix of all the different configurations we will have to maintain, right?

bruno resigned from this revision.Nov 9 2020, 11:54 AM