This is an archive of the discontinued LLVM Phabricator instance.

[Modules] Add a flag to control builtin headers being in system modules
ClosedPublic

Authored by iana on Sep 7 2023, 11:54 AM.

Details

Summary

Including select builtin headers in system modules is a workaround for module cycles, primarily in Apple's Darwin module that includes all of its C standard library headers. The workaround is problematic because it doesn't include all of the builtin headers (inttypes.h is notably absent), and it also doesn't include C++ headers. The straightforward for for this is to make top level modules for all of the C standard library headers and unwind.h in C++, clang, and the OS.

However, doing so in clang before the OS modules are ready re-introduces the module cycles. Add a -fbuiltin-headers-in-system-modules option to control if the special builtin headers belong to system modules or builtin modules. Pass the option by default for Apple.

Diff Detail

Event Timeline

iana created this revision.Sep 7 2023, 11:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 7 2023, 11:54 AM
Herald added a subscriber: ributzka. · View Herald Transcript
iana requested review of this revision.Sep 7 2023, 11:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 7 2023, 11:54 AM
iana added a comment.EditedSep 7 2023, 12:13 PM

A big assumption this patch makes is that ModuleMap::isBuiltinHeader is primarily to support Apple's unfortunate module needs. Thus this patch turns that behavior off by default, which makes things work the way one would expect. That is, when usr/include/module.modulemap references stdint.h, that just means usr/include/stdint.h and it doesn't also pull in the clang builtin stdint.h, it doesn't transform usr/include/stdint.h into a textual header, etc. I'm hoping that's acceptable behavior on non-Apple platforms, but if someone knows otherwise please let me know and we can rethink how the option should be defined and set.

Also we need the behavior switch to be a run time thing and not a compile time thing because Apple needs the same version of clang to support both current and old Apple SDKs. Ideally we'll fix newer SDKs to not require the flag, but we'll still need it passed for older SDKs.

but need to be repeatedly included when they're used in system modules

How does this work today? Wouldn't the include guard prevent this?

clang/include/clang/Basic/Features.def
233

This appears to be in a list of // C++ TSes. Near the bottom there is 'Miscellaneous language extensions' which might be a better fit.

clang/lib/Driver/ToolChains/Darwin.cpp
2848

I would have expected the default to be true; do old SDKs not have SDKInfo or something?

2853

Are these really supposed to be 100 or is this a placeholder?

iana added inline comments.Sep 7 2023, 1:28 PM
clang/include/clang/Driver/Options.td
2955

I don't love this flag name, but I couldn't come up with anything more succinct. It actually does two things.

  1. Turns on ModuleMap::isBuiltinHeader behavior, that is the builtin headers get added into the system modules.
  2. Causes the module map parser to ignore all of the new builtin modules added in D159064. This is needed before the modules are added to assist with Apple internal staging with the Swift compiler.
2956–2957

I don't fully understand what these two do, but I think they apply here?

clang/lib/Headers/__stddef_ptrdiff_t.h
10 ↗(On Diff #556191)

This is to support patterns like the Modules/stddef.c test. The module map looks like this.

module StdDef {
  module Other {
    header "other.h" // includes stddef.h
  }
  module PtrDiffT {
    header "ptrdiff_t.h" // defines __needs_ptrdiff_t and includes stddef.h
  }
  module IncludeAgain {
    header "include_again.h" // includes stddef.h
  }
}

With builtin_headers_in_system_modules, __stddef_ptrdiff_t.h will not be in a module. It will first be seen in other.h and define its header guard. When it's seen again in ptrdiff_t.h, it needs to declare ptrdiff_t again, so it needs to ignore its header guard.

Without builtin_headers_in_system_modules, __stddef_ptrdiff_t.h will be in the _Builtin_stddef module. It needs its header guard there so that it's only included once when building the module. Even though ptrdiff_t.h will see the header guard, it will still import _Builtin_stddef.ptrdiff_t and get the declaration.

clang/lib/Headers/stddef.h
10 ↗(On Diff #556191)

As a textual header, this needs to ignore its header guard in module mode, even without builtin_headers_in_system_modules. Going back to the Modules/stddef.c test.

module StdDef {
  module Other {
    header "other.h" // includes stddef.h
    export *
  }
  module PtrDiffT {
    header "ptrdiff_t.h" // defines __needs_ptrdiff_t and includes stddef.h
  }
  module IncludeAgain {
    header "include_again.h" // includes stddef.h
    export *
  }
}

When other.h compiles, STDDEF_H will get defined and _Builtin_stddef.ptrdiff_t will be imported. But when include_again.h compiles, STDDEF_H will already be defined and _Builtin_stddef.ptrdiff_t won't be imported because the include will be skipped in stddef.h. That's ok when include_again.h is building because it will see ptrdiff_t from when other.h included it. But a client that only includes include_again.h won't see any types because StdDef.IncludeAgain skipped the contents of stddef.h and skipped importing _Builtin_stddef.ptrdiff_t, and so there's nothing for export * to export.

The other solution is to never define __STDDEF_H in module mode, but I think that's strange behavior. Several headers in the Apple SDKs test if a header was included by checking for its header guard (questionable, but it's what they do). stddef.h isn't one of those headers, but it still seems like it should define everything in modules mode that it defines without modules.

clang/lib/Lex/ModuleMap.cpp
259

This had to be moved up just because it's used by methods under here.

1540

This an everything under it is gross, but the only other thing I could think of was to duplicate the module map and decided to load a special one for BuiltinHeadersInSystemModules. That seemed weirder.

clang/test/Modules/Werror-Wsystem-headers.m
6–8

Most of these tests aren't really affected by -fbuiltin-headers-in-system-modules, they just coincidentally use Inputs/System/usr/include and that needs -fbuiltin-headers-in-system-modules.

clang/test/Modules/shadowed-submodule.m
2

This could probably be redone to not use stdarg.h as the shadow and then it wouldn't require -fbuiltin-headers-in-system-modules, but it's probably ok as-is?

clang/test/Modules/stddef.c
2

This one and the next one will be updated in the next review to test without -fbuiltin-headers-in-system-modules too.

iana edited the summary of this revision. (Show Details)Sep 7 2023, 1:28 PM

but need to be repeatedly included when they're used in system modules

How does this work today? Wouldn't the include guard prevent this?

Today they don't define their include guard when building with modules.

iana marked an inline comment as done.Sep 7 2023, 1:31 PM
iana added inline comments.
clang/include/clang/Basic/Features.def
233

I thought it went with FEATURE(modules, LangOpts.Modules), but I can put it down in 'Miscellaneous language extensions' if that's better.

clang/lib/Driver/ToolChains/Darwin.cpp
2848

I'm not sure if we ever expect there to not be SDKInfo, but I thought it safer to default to the current behavior which is false.

2853

Placeholder until we actually update the SDKs.

How does this work today? Wouldn't the include guard prevent this?

Today they don't define their include guard when building with modules.

Thanks - I can see some headers behave that way, or in other cases there are other sorts of has_feature(modules) checks that I can see you're modifying, but what about all the *va_* headers that currently just seem to have normal header guards?

iana marked an inline comment as done.Sep 7 2023, 1:46 PM

How does this work today? Wouldn't the include guard prevent this?

Today they don't define their include guard when building with modules.

Thanks - I can see some headers behave that way, or in other cases there are other sorts of has_feature(modules) checks that I can see you're modifying, but what about all the *va_* headers that currently just seem to have normal header guards?

I think the stdarg ones didn't always work properly in modules and we just got lucky.

iana updated this revision to Diff 556206.Sep 7 2023, 2:16 PM

Fix formatting

iana updated this revision to Diff 556207.Sep 7 2023, 2:17 PM

Rebase

I suggest we add a comment explaining the weird has_feature checks being added here (even if they're less weird than what they're replacing). Maybe just a comment in stdarg.h and/or stddef.h and then the __std(arg|def) headers could just add a one-liner reference "see comment in std(arg|def).h".

clang/include/clang/Basic/Features.def
233

I think misc is probably better.

clang/include/clang/Driver/Options.td
2955

I think the name is good enough for a -cc1 option, unless someone else has a better suggestion. I suggest adding a HelpText - you could basically copy the description you added to LangOptions.def

2956–2957

I agree they apply. f_Group affects a ClaimAllArgs call (via inheritance from CompileOnly_Group) but mostly puts it under a group in documentation. NoXarchOption means you can't modify this option per-arch in a multi-arch compilation. That seems like what you want.

clang/lib/Lex/ModuleMap.cpp
1540

Would it be possible to get the right semantics using a requires decl in the modules plus a new module feature?

iana marked an inline comment as done.Sep 7 2023, 2:57 PM
iana added inline comments.
clang/include/clang/Basic/Features.def
233

👍

clang/include/clang/Driver/Options.td
2955

Sure

clang/lib/Lex/ModuleMap.cpp
1540

Maybe? I guess e.g. _Builtin_stdint would be seen, the new requires flag would be seen, and if it's not satisfied then its headers wouldn't be added to the module? That seems like a really weird use of requires though.

iana updated this revision to Diff 556230.Sep 7 2023, 11:17 PM
iana marked 3 inline comments as done.

Add comments to stdarg.h and stddef.h explaining their interesting header guard setup.
Ignore -fbuiltin-headers-in-system-modules if -fmodules isn't passed.
DriverKit doesn't need to pass -fbuiltin-headers-in-system-modules since it doesn't have a module map at all.
Add a test for -fbuiltin-headers-in-system-modules/__has_feature(builtin_headers_in_system_modules)
Miscellaneous review feedback

benlangmuir accepted this revision.Sep 8 2023, 8:39 AM
This revision is now accepted and ready to land.Sep 8 2023, 8:39 AM
iana updated this revision to Diff 556627.Sep 12 2023, 11:48 PM

__stddef_max_align_t.h always needs a header guard, because the type merger can't handle struct's. If it has a header guard then it always needs to have a module too, even if -fbuiltin-headers-in-system-modules is passed.

Change ModuleMap to not ignore the builtin modules completely, but rather to just skip their top level headers (which is isBuiltinHeaderName + inttypes.h and stdnoreturn.h). With that change, the implementation headers are always modular, and __has_feature(builtin_headers_in_system_modules) is no longer necessary anywhere.

iana requested review of this revision.Sep 13 2023, 12:07 AM
iana edited the summary of this revision. (Show Details)

A big assumption this patch makes is that ModuleMap::isBuiltinHeader is primarily to support Apple's unfortunate module needs. Thus this patch turns that behavior off by default, which makes things work the way one would expect. That is, when usr/include/module.modulemap references stdint.h, that just means usr/include/stdint.h and it doesn't also pull in the clang builtin stdint.h, it doesn't transform usr/include/stdint.h into a textual header, etc. I'm hoping that's acceptable behavior on non-Apple platforms, but if someone knows otherwise please let me know and we can rethink how the option should be defined and set.

Good way to test the assumptions can be checking how Swift works with this change on Linux. There are not as many module maps there as in Apple SDKs but there are some.

clang/include/clang/Basic/LangOptions.def
176

Why not to make the default value true to preserve the old behavior and switch on the new behavior in the driver? It's not a veiled way to force you to change it, I'm genuinely curious and want to consider pro/cons of various alternatives.

clang/lib/Driver/ToolChains/Darwin.cpp
2860–2861

Another question regarding defaults. Doesn't look consistent with your position

[...] I thought it safer to default to the current behavior which is false.

Personally I gravitate towards false for extra safety but it's not a strong opinion.

iana marked an inline comment as done.Sep 18 2023, 10:08 PM
iana added inline comments.
clang/include/clang/Basic/LangOptions.def
176

I did it this way because the current behavior is nonintuitive and just kind of bizarre. I had no idea that Darwin.C.stdint also included the compiler's stdint.h, and that the OS's stdint.h was treated textually. I thought it would be better for the default behavior to be for modules to behave consistently and obviously, not have secret special cases, and be usable with C++.

The con of course is that if anyone besides Apple depends on the current behavior they're going to either have to fix their OS modules or add this flag. I'm not sure if other platforms have a convenient driver class where they can do that.

clang/lib/Driver/ToolChains/Darwin.cpp
2860–2861

default only covers DriverKit, which can be true. Any future platforms should also default to true.

vsapsai added inline comments.Sep 19 2023, 6:05 PM
clang/include/clang/Basic/LangOptions.def
176

I'm fine with the more aggressive approach as long as we test Swift on Linux. That's the only project I know that is willing to keep using clang modules. For other projects if things break and it is a motivation to move to C++ modules, it is a positive direction, as for me.

Another reason I wanted to be more cautious is because of https://discourse.llvm.org/t/configure-script-breakage-with-the-new-werror-implicit-function-declaration/65213 change. Clang started enforcing 20+ years old rule and that has broken some stuff. So I don't see "you should've known better before relying on this behavior" as a good argument.

Anyway, for Swift testing here is what you can do (if you have your own plan, go ahead):

  1. Make PR for the same change as this one at https://github.com/apple/llvm-project/
  2. Create inconsequential PR at https://github.com/apple/swift/
  3. Use Cross Repository Testing to test clang change from swift on Linux.
clang/lib/Driver/ToolChains/Darwin.cpp
2860–2861

Ok, that looks like a reasonable choice.

iana marked 2 inline comments as done.Sep 19 2023, 9:49 PM
iana added inline comments.
clang/include/clang/Basic/LangOptions.def
176

If things break I think we would just add the flag to a few more drivers and fix it like that. I'm definitely not trying to drive anyone away from clang modules, I'm just hoping they don't need the strange current behavior. If it turns out to be too much of a mess then it can be on by default and we'll do an fno flag instead that we can turn off in the Darwin driver when we're ready.

Bigcheese accepted this revision.Sep 26 2023, 5:32 PM

The implementation of this change looks good. I share the concern that changing the default may break other users, but that's easily remedied and this is the correct default. It would be good to hear from at least the Google folks before landing though.

This revision is now accepted and ready to land.Sep 26 2023, 5:32 PM
ChuanqiXu accepted this revision.Sep 26 2023, 6:46 PM

@sammccall @rsmith - figure if this does impact us, we'll use the flag to opt-out in the short term while we figure out longer-term solution, yeah? Is there any pre-emptive testing you think is worth doing?

mib added a subscriber: mib.Sep 29 2023, 9:13 AM

@iana I believe this is affecting lldb macOS bot: https://green.lab.llvm.org/green/view/LLDB/job/as-lldb-cmake/6316/

Can you take a look ? Thanks!