This is an archive of the discontinued LLVM Phabricator instance.

[Clang][cc1] Make -fno-modules-local-submodule-visibility the default for ObjC++20
AcceptedPublic

Authored by Bigcheese on Jan 10 2023, 5:19 PM.

Details

Summary

Currently there is no way to have Objective-C++20 without LSV. This is
a problem because some existing ObjectiveC code is not compatible with
LSV. This patch provides a way to disable LSV even in C++20 mode and
makes it the default for ObjC++20.

Diff Detail

Event Timeline

Bigcheese created this revision.Jan 10 2023, 5:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 10 2023, 5:19 PM
Herald added a subscriber: ributzka. · View Herald Transcript
Bigcheese requested review of this revision.Jan 10 2023, 5:19 PM
This revision is now accepted and ready to land.Jan 10 2023, 6:06 PM

This is a problem because some existing ObjectiveC code is not compatible with LSV

Hmm, how is that true? Does this code only build with Clang Header Modules enabled, and can't build without that? (if it could build without it, I don't know why it'd need LSV, if I'm understanding all these things correctly... )

This is a problem because some existing ObjectiveC code is not compatible with LSV

Hmm, how is that true? Does this code only build with Clang Header Modules enabled, and can't build without that? (if it could build without it, I don't know why it'd need LSV, if I'm understanding all these things correctly... )

It only builds without modules or without LSV. The code either relies on incidental header inclusion at the top level, or in other cases there are bugs with ObjC categories and protocols in LSV mode. I'm about to update this patch to actually switch the default to no LSV for ObjC because of how big the issue is. Most ObjC code won't actually work in LSV mode.

We'd like to eventually fix this, but for now it's just not useful, and the previous default for ObjC++ was no LSV.

This is a problem because some existing ObjectiveC code is not compatible with LSV

Hmm, how is that true? Does this code only build with Clang Header Modules enabled, and can't build without that? (if it could build without it, I don't know why it'd need LSV, if I'm understanding all these things correctly... )

It only builds without modules or without LSV.

Perhaps I've got things confused, but my understanding of LSV was that it prevented other headers in the same modulemap from leaking into the use/inclusion of one header in a module. But that indirect inclusions were still valid/exposed (eg: header A, B, and C in a module, A includes C - without LSV, including A also incidentally exposes B, but with or without LSV including A does expose C).

My understanding was that LSV wouldn't reject anything that non-modular builds accepted, but would prevent modules from accepting things that non-modular builds rejects. (so LSV is a way to remove modular build false-accepts, basically)

Have I got that wrong in some way(s)?

The code either relies on incidental header inclusion at the top level, or in other cases there are bugs with ObjC categories and protocols in LSV mode.

Bugs/incompleteness with LSV+ObjC I could believe/understand, and if that's the main issue I guess that's a choice that can be made about which bugs are worth fixing, etc, and maybe we should document the LSV change as working around those bugs for now (perhaps forever - if the benefits of LSV aren't worth the investment in fixing those bugs).

Perhaps I've got things confused, but my understanding of LSV was that it prevented other headers in the same modulemap from leaking into the use/inclusion of one header in a module. But that indirect inclusions were still valid/exposed (eg: header A, B, and C in a module, A includes C - without LSV, including A also incidentally exposes B, but with or without LSV including A does expose C).

Regardless of LSV, importing A doesn't incidentally expose B. For exposing C, module A would need to re-export C.

My understanding was that LSV wouldn't reject anything that non-modular builds accepted,

LSV rejects things that both non-LSV modular and textual builds accept:

// RUN: rm -rf %t
// RUN: split-file %s %t

//--- module.modulemap
module M {
  module S1 { header "s1.h" }
  module S2 { header "s2.h" }
}

//--- s1.h
#define T1 int

//--- s2.h
T1 foo();

//--- tu.c
#include "s1.h"
#include "s2.h"

// Let's see how "T1" becomes available in "s2.h" with different configurations.

// Textual build: succeeds, "tu.c" includes "s1.h" before "s2.h".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c

// Modular build: succeeds, the entire compilation of M shares one preprocessing context,
//                which includes "s1.h" before "s2.h" due to their order in the module map.
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache

// Modular build with LSV: fails, compilation of M creates separate context for each submodule,
//                         so "s2.h" does not see symbols of "s1.h" (without explicitly importing it).
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache -fmodules-local-submodule-visibility

but would prevent modules from accepting things that non-modular builds rejects. (so LSV is a way to remove modular build false-accepts, basically)

This is true:

...

//--- tu.c
#include "s2.h"

// Textual build: fails, "s2.h" has no way of knowing about "T1".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c

// Modular build: succeeds, "T1" is available in "s2.h" as a leftover from compiling "s1.h".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache

// Modular build with LSV: fails, "T1" is not available in the preprocessing context of "s2.h".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache -fmodules-local-submodule-visibility

Bugs/incompleteness with LSV+ObjC I could believe/understand, and if that's the main issue I guess that's a choice that can be made about which bugs are worth fixing, etc, and maybe we should document the LSV change as working around those bugs for now (perhaps forever - if the benefits of LSV aren't worth the investment in fixing those bugs).

So yeah, we probably have some bugs/incompleteness specific to LSV+ObjC, but our SDK also contains code that compiles fine with both textual inclusion and modules, but fails with modules + LSV. Having a way to disable LSV under Objective-C++20 is necessary for us in that case. Note that this patch doesn't change any behavior yet, it just makes -fno-modules-local-submodule-visibility available in -cc1.

Perhaps I've got things confused, but my understanding of LSV was that it prevented other headers in the same modulemap from leaking into the use/inclusion of one header in a module. But that indirect inclusions were still valid/exposed (eg: header A, B, and C in a module, A includes C - without LSV, including A also incidentally exposes B, but with or without LSV including A does expose C).

Regardless of LSV, importing A doesn't incidentally expose B. For exposing C, module A would need to re-export C.

My understanding was that LSV wouldn't reject anything that non-modular builds accepted,

LSV rejects things that both non-LSV modular and textual builds accept:

// RUN: rm -rf %t
// RUN: split-file %s %t

//--- module.modulemap
module M {
  module S1 { header "s1.h" }
  module S2 { header "s2.h" }
}

//--- s1.h
#define T1 int

//--- s2.h
T1 foo();

//--- tu.c
#include "s1.h"
#include "s2.h"

// Let's see how "T1" becomes available in "s2.h" with different configurations.

// Textual build: succeeds, "tu.c" includes "s1.h" before "s2.h".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c

// Modular build: succeeds, the entire compilation of M shares one preprocessing context,
//                which includes "s1.h" before "s2.h" due to their order in the module map.
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache

// Modular build with LSV: fails, compilation of M creates separate context for each submodule,
//                         so "s2.h" does not see symbols of "s1.h" (without explicitly importing it).
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache -fmodules-local-submodule-visibility

Ah, I guess that's sort of incidental, though - the non-modular code is brittle (has a header order dependency) & would be made non-brittle by including "s1.h" from "s2.h"

but yeah, fair point - -fmodules-local-submodule-visibility does cause false negatives, in the sense that code that compiled without modules would be rejected with modules.

but would prevent modules from accepting things that non-modular builds rejects. (so LSV is a way to remove modular build false-accepts, basically)

This is true:

...

//--- tu.c
#include "s2.h"

// Textual build: fails, "s2.h" has no way of knowing about "T1".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c

// Modular build: succeeds, "T1" is available in "s2.h" as a leftover from compiling "s1.h".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache

// Modular build with LSV: fails, "T1" is not available in the preprocessing context of "s2.h".
// RUN: %clang_cc1 -fsyntax-only %t/tu.c -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache -fmodules-local-submodule-visibility

Right right

Bugs/incompleteness with LSV+ObjC I could believe/understand, and if that's the main issue I guess that's a choice that can be made about which bugs are worth fixing, etc, and maybe we should document the LSV change as working around those bugs for now (perhaps forever - if the benefits of LSV aren't worth the investment in fixing those bugs).

So yeah, we probably have some bugs/incompleteness specific to LSV+ObjC, but our SDK also contains code that compiles fine with both textual inclusion and modules, but fails with modules + LSV. Having a way to disable LSV under Objective-C++20 is necessary for us in that case. Note that this patch doesn't change any behavior yet, it just makes -fno-modules-local-submodule-visibility available in -cc1.

*nod* Thanks for walking me through it though. I wasn't aware of the rejects-'valid' case. Makes sense that some legacy codebase needs this feature to continue to build. I was only aware of/thinking of the second case where non-LSV would accept-invalid & that would imply a codebase that could /only/ compile with modules, which I was surprised by - I understand my misunderstanding now.

Carry on!

Bigcheese updated this revision to Diff 489970.Jan 17 2023, 4:08 PM
Bigcheese retitled this revision from [Clang][cc1] Add -fno-modules-local-submodule-visibility to override the default to [Clang][cc1] Make -fno-modules-local-submodule-visibility the default for ObjC++20.
Bigcheese edited the summary of this revision. (Show Details)