This is an archive of the discontinued LLVM Phabricator instance.

[libcxx][modules] protects users from relying on libc++ detail headers (1/n)
ClosedPublic

Authored by cjdb on Jul 13 2021, 12:49 PM.

Details

Summary

libc++ has started splicing standard library headers into much more
fine-grained content for maintainability. It's very likely that outdated
and naive tooling (some of which is outside of LLVM's scope) will
suggest users include things such as <__algorithm/find.h> instead of
<algorithm>, and Hyrum's law suggests that users will eventually begin
to rely on this without the help of tooling. As such, this commit
intends to protect users from themselves, by making it a hard error for
anyone outside of the standard library to include libc++ detail headers.

This is the first of four patches. Patch #2 will solve the problem for
pre-processor #includes; patches #3 and #4 will solve the problem for
<__tree> and <__hash_table> (since I've never touched the test cases
that are failing for these two, I want to split them out into their own
commits to be extra careful). Patch #5 will concern itself with
<__threading_support>, which intersects with libcxxabi (which I know
even less about).

Diff Detail

Event Timeline

cjdb created this revision.Jul 13 2021, 12:49 PM
cjdb requested review of this revision.Jul 13 2021, 12:49 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
cjdb added a comment.Jul 13 2021, 12:50 PM

This was originally the Clang patch D105439. Some context from over there is relevant to the discussion here.

ldionne accepted this revision.Jul 13 2021, 12:57 PM

This seems like an elegant and straightforward approach to me. LGTM if CI passes.

This revision is now accepted and ready to land.Jul 13 2021, 12:57 PM
Quuxplusone accepted this revision.Jul 13 2021, 1:01 PM
Quuxplusone added a subscriber: Quuxplusone.

I predict controversy with patch #2. ;) But this certainly looks correct for the Modules part (and #3 and #4 should be equally straightforward).

Mordante accepted this revision.Jul 13 2021, 1:19 PM
Mordante added a subscriber: Mordante.

This seems like an elegant and straightforward approach to me. LGTM if CI passes.

+1

cjdb updated this revision to Diff 358504.Jul 13 2021, 9:08 PM

rolls back __mutex_base to see if CI will now pass

cjdb updated this revision to Diff 358534.Jul 14 2021, 12:10 AM
  • re-adds the stuff I removed
  • adds <span> and <shared_mutex> to the module map
    • fixes tests to account for this
cjdb updated this revision to Diff 358634.Jul 14 2021, 9:15 AM
  • corrects <span>
  • fixes libcxxabi breaks

Reviwers will probably want to take a look, but I'm going to ask that you hold off till CI is 100% green (or we just have that pesky Apple flake). libcxxabi might result in some churn since I'm unable to test it locally right now, so there could be two or three more commits before I get it completely right.

Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2021, 9:15 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
This revision now requires review to proceed.Jul 14 2021, 9:15 AM
ldionne requested changes to this revision.Jul 14 2021, 9:28 AM

I'm going to request changes on the patch so I remember to look at it again before it ships. Please ping me when you want this reviewed.

This revision now requires changes to proceed.Jul 14 2021, 9:28 AM
cjdb updated this revision to Diff 358650.Jul 14 2021, 9:50 AM
cjdb retitled this revision from [libcxx][modules] protects users from relying on libc++ detail headers (1/4) to [libcxx][modules] protects users from relying on libc++ detail headers (1/5).
cjdb edited the summary of this revision. (Show Details)
cjdb removed a project: Restricted Project.
  • completely rolls back libcxxabi changes
  • makes <__threading_support> a "public" header again
  • expands number of commits this will take from 4 to 5 to account for <__threading_support>.

I don't see why libcxxabi couldn't instead include <thread> to get what's needed out of <__threading_support>, but I'm not convinced that's the right path forward without giving it some thought. Probably best to defer that to another patch where it owns the spotlight.

cjdb added a comment.Jul 14 2021, 10:16 AM

@ldionne the modules build is green, so I'm going to pretend the rest passes for the purposes of asking you to re-review (I will of course wait for CI to actually be (green % Apple thread) flake before merging).

ldionne accepted this revision.Jul 14 2021, 11:44 AM

The Apple failure is caused by your patch not being quite rebased onto main. Feel free to ship as-is.

This revision is now accepted and ready to land.Jul 14 2021, 11:44 AM
ldionne added a comment.EditedJul 14 2021, 11:47 AM

The Apple failure is caused by your patch not being quite rebased onto main. Feel free to ship as-is.

Phabricator glitch.

cjdb updated this revision to Diff 358849.Jul 14 2021, 11:54 PM
cjdb retitled this revision from [libcxx][modules] protects users from relying on libc++ detail headers (1/5) to [libcxx][modules] protects users from relying on libc++ detail headers (1/n).

Sorry @ldionne, but I decided to add tests for this, and I'm gonna need you to give this one last cursory review.
You can get away with reading the Python script that generated all the test cases, so it's not like there's a lot of extra work: it just looks that way :)

I promise that I'll merge after this.

miscco added a subscriber: miscco.Jul 15 2021, 12:00 AM
miscco added inline comments.
libcxx/utils/generate_private_header_tests.py
19

Just to be sure, can we disable this for MSVC STL? There are no such headers there and this is rather specific to libc++

cjdb added inline comments.Jul 15 2021, 12:29 AM
libcxx/utils/generate_private_header_tests.py
19

These test are generated in libcxx/test/libcxx, which tests libc++ internals, so I don't think it's necessary to guard on that. Having said that, shouldn't REQUIRES: modules-build disable for Microsoft/STL anyway?

STL_MSFT added inline comments.
libcxx/utils/generate_private_header_tests.py
19

microsoft/STL runs only libcxx/test/std, not libcxx/test/libcxx, so this sounds perfectly fine to me.

libcxx/utils/generate_private_header_tests.py
40–50

FWIW, I'd find this easier to read as

def is_still_public(p):
    return p.startswith('__support/') or p in [
        ...
    ]

paths = [p[10:] for p in Path('include').rglob('*') if p.as_posix().startswith('include/__') and not p.is_dir()]
private_paths = [p for p in paths if not is_still_public(p)]
cjdb added inline comments.Jul 15 2021, 9:49 AM
libcxx/utils/generate_private_header_tests.py
40–50

Yes, very much so. I was extremely disappointed when I found out I couldn't do this

Path("include").rglob('*')
               .filter(...)
               .map(...)
               .filter(...)

I guess list comprehensions are the Pythonic way of achieving this.

ldionne requested changes to this revision.Jul 15 2021, 11:38 AM

Good idea about adding tests, thanks for doing that.

It does raise a couple issues that I don't think we've had before:

  1. We'll now have to re-generate the auto-gen tests very often, since we add detail headers a lot more often than we add public headers. I don't think that's an actual issue, especially since we now have the libcxx-generate-files target to do it, but it's worth keeping in mind. If it becomes a pain, we could potentially go for a different approach where the tests are generated automatically on the fly when we run the test suite. Unless that becomes a real issue, I do have a preference for something simple where we can see the generated sources directly, like today.
  1. Today, we never have to delete auto-gen tests, because we basically never remove/rename public headers. This commit changes that. Since there is some churn in our private headers (which is okay), we might start accumulating older headers that have since then been removed/renamed. I think a simple way to solve that would be to remove the whole libcxx/test/libcxx/diagnostics/detail.headers directory in your generation script. Also, if someone manually modifies one of the header tests (or tries adding their own), CI would fail in the check-generated-output CI job (since whatever they added would be removed).
libcxx/test/libcxx/diagnostics/detail.headers/algorithm/adjacent_find.module.verify.cpp
1

All your generated headers have this newline at the beginning -- they shouldn't.

I also think they should have a newline at the end (might not be required technically but we always do that).

13

This could look more like:

// WARNING: This test was generated by {script_name}
// and should not be edited manually.

That's what we do in other generated tests.

libcxx/utils/generate_private_header_tests.py
54

Awkward line break.

This revision now requires changes to proceed.Jul 15 2021, 11:38 AM
libcxx/utils/generate_private_header_tests.py
29–32

Please follow the magic incantation at the top of all the other test-generating scripts, so that this script can be run from anywhere.

56

@ldionne wrote:

Today, we never have to delete auto-gen tests, because we basically never remove/rename public headers. This commit changes that. Since there is some churn in our private headers (which is okay), we might start accumulating older headers that have since then been removed/renamed. I think a simple way to solve that would be to remove the whole libcxx/test/libcxx/diagnostics/detail.headers directory in your generation script.

Another idea (and maybe a better one for the maintainers' sanity) would be to generate just one test file per public header. Instead of generating detail.headers/utility/{__decay_copy,as_const,cmp,declval,...}.module.verify.cpp, simply generate one test file, detail.headers/utility.verify.cpp, containing the code

// expected-error@*:* {{use of private header from outside its module: '__utility/__decay_copy.h'}}
#include <__utility/__decay_copy.h>

// expected-error@*:* {{use of private header from outside its module: '__utility/as_const.h'}}
#include <__utility/as_const.h>

// expected-error@*:* {{use of private header from outside its module: '__utility/cmp.h'}}
#include <__utility/cmp.h>

and so on and so on. Then these files would change often, but be added/deleted rarely, just like all the other generated test files.

Finally: Is the .module. in foo.module.verify.cpp significant to llvm-lit? Or could we go with just foo.verify.cpp, for clarity? find ../libcxx/test/ -name \*.module.\* doesn't show any results in trunk at the moment, so I suspect it's not significant and can be removed.

cjdb added inline comments.Jul 15 2021, 3:28 PM
libcxx/test/libcxx/diagnostics/detail.headers/algorithm/adjacent_find.module.verify.cpp
1

All your generated headers have this newline at the beginning -- they shouldn't.

Is that really a concern? (I'm going to fix it, but I wanna understand why.)

I also think they should have a newline at the end (might not be required technically but we always do that).

They already do. Evidence: Phab will tell you "no newline before EOF".

libcxx/utils/generate_private_header_tests.py
56

@ldionne wrote:

Today, we never have to delete auto-gen tests, because we basically never remove/rename public headers. This commit changes that. Since there is some churn in our private headers (which is okay), we might start accumulating older headers that have since then been removed/renamed. I think a simple way to solve that would be to remove the whole libcxx/test/libcxx/diagnostics/detail.headers directory in your generation script.

Another idea (and maybe a better one for the maintainers' sanity) would be to generate just one test file per public header. Instead of generating detail.headers/utility/{__decay_copy,as_const,cmp,declval,...}.module.verify.cpp, simply generate one test file, detail.headers/utility.verify.cpp, containing the code

// expected-error@*:* {{use of private header from outside its module: '__utility/__decay_copy.h'}}
#include <__utility/__decay_copy.h>

// expected-error@*:* {{use of private header from outside its module: '__utility/as_const.h'}}
#include <__utility/as_const.h>

// expected-error@*:* {{use of private header from outside its module: '__utility/cmp.h'}}
#include <__utility/cmp.h>

and so on and so on. Then these files would change often, but be added/deleted rarely, just like all the other generated test files.

I initially did this (it was waaaay cleaner and could be done by hand too), but modules mode only reports one header error at a time, so we end up with two expected-but-not-seen errors :(

Finally: Is the .module. in foo.module.verify.cpp significant to llvm-lit? Or could we go with just foo.verify.cpp, for clarity? find ../libcxx/test/ -name \*.module.\* doesn't show any results in trunk at the moment, so I suspect it's not significant and can be removed.

It's significant for a future patch that I expect to throw up in a day or so.

cjdb updated this revision to Diff 359205.Jul 15 2021, 8:58 PM

applies suggested feedback

cjdb updated this revision to Diff 359218.Jul 15 2021, 10:10 PM

adds the CMakeLists.txt bit

ldionne accepted this revision.Jul 16 2021, 8:56 AM

LGTM but it would be great to see CI passing before committing this!

libcxx/test/libcxx/diagnostics/detail.headers/algorithm/adjacent_find.module.verify.cpp
1

It's not a concern per-se, but we might as well generate files the way we'd do it by hand.

I guess a minor concern is that you want the modline at the top to be the very first line (but I would have little love for an editor that can't figure out syntax highlighting from the .cpp extension).

libcxx/utils/CMakeLists.txt
16 ↗(On Diff #359218)

When you commit, please change this COMMENT to the imperative, like the others above.

This revision is now accepted and ready to land.Jul 16 2021, 8:56 AM
cjdb updated this revision to Diff 359347.Jul 16 2021, 9:09 AM

rebases to activate CI

cjdb updated this revision to Diff 359357.Jul 16 2021, 9:18 AM

fixes minor regex mishap

ldionne added inline comments.Jul 29 2021, 11:15 AM
libcxx/utils/CMakeLists.txt
16 ↗(On Diff #359218)

This was never done. Please make sure you read the comments when you get a LGTM, often I request changes but still give the green check mark to avoid stalling the patch on something trivial.

In this case it's really not a big deal, but ideally this wouldn't happen with more important comments. I'll make this change as a NFC.

cjdb added inline comments.Jul 29 2021, 11:30 AM
libcxx/utils/CMakeLists.txt
16 ↗(On Diff #359218)

Sorry, I clearly missed this comment.