This is an archive of the discontinued LLVM Phabricator instance.

[libcxx][NFC] replaces `<utility>` includes with specific headers
AbandonedPublic

Authored by cjdb on Jun 26 2021, 9:30 AM.

Details

Reviewers
ldionne
zoecarver
Mordante
EricWF
Group Reviewers
Restricted Project
Summary

Depends on D104942.

Diff Detail

Event Timeline

cjdb requested review of this revision.Jun 26 2021, 9:30 AM
cjdb created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJun 26 2021, 9:30 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
cjdb updated this revision to Diff 354705.Jun 26 2021, 1:48 PM

rebases to activate CI (I think it flaked)

cjdb updated this revision to Diff 354712.Jun 26 2021, 4:23 PM

found missing header

cjdb updated this revision to Diff 354720.Jun 26 2021, 8:07 PM

fixes benchmarks

ldionne requested changes to this revision.Jun 28 2021, 9:01 AM
ldionne added inline comments.
libcxx/include/__functional/greater.h
52–55 ↗(On Diff #354720)

What's the rationale for making this change here? Why not include __utility/forward.h?

libcxx/include/type_traits
1366–1375

Why this? And most importantly, why is it in this patch?

This revision now requires changes to proceed.Jun 28 2021, 9:01 AM
cjdb added inline comments.Jun 28 2021, 10:15 AM
libcxx/include/__functional/greater.h
52–55 ↗(On Diff #354720)

Oh, did this slip through? I was experimenting with something and then abandoned it. Will fix.

libcxx/include/type_traits
1366–1375

Seems I did a copy/paste instead of cut/paste. declval already had its own header but was never removed from type_traits, and we're not violating the ODR by declaring it twice in two different places (but diagnostics point to <type_traits> instead of <utility>).

cjdb updated this revision to Diff 356205.Jul 2 2021, 10:00 AM

rebases to acivate CI

cjdb updated this revision to Diff 356292.Jul 2 2021, 4:12 PM

adds <cstddef> and qualifies std::nullptr_t

Nice cleanup!

libcxx/include/__functional/greater.h
52–55 ↗(On Diff #354720)

I noticed the same change at other places too.

libcxx/include/__split_buffer
6

Please sort the headers.

libcxx/include/algorithm
655

Can you remove the duplicated includes and sort them. Not part of your change, but would be nice to do it along in this patch,

libcxx/include/mutex
200

D104942 landed __functional/invoke.h, can you use that? If not please update the comment.

cjdb marked an inline comment as done.Jul 3 2021, 11:29 AM
cjdb added inline comments.
libcxx/include/__functional/greater.h
52–55 ↗(On Diff #354720)

It's moot because these files don't exist any more.

libcxx/include/mutex
200

Out of scope: I'll apply it in D104982.

cjdb marked an inline comment as done.Jul 3 2021, 11:33 AM
cjdb added inline comments.
libcxx/include/mutex
200

After further inspection, __invoke still lives in type_traits. Won't fix in D104982.

ldionne requested changes to this revision.Jul 5 2021, 9:37 AM
ldionne added inline comments.
libcxx/include/__functional/identity.h
27

Same here, this shouldn't be part of this change.

libcxx/include/concepts
216

Same.. those shouldn't be part of this patch (and if you want to do it throughout in a different patch, let's discuss beforehand).

libcxx/include/module.modulemap
716

What happens if you don't export __tuple?

This revision now requires changes to proceed.Jul 5 2021, 9:37 AM
cjdb added inline comments.Jul 5 2021, 10:46 AM
libcxx/include/concepts
216

Hmm... I didn't realise I did it to so many things. I owe @Mordante an apology in that case.

I'll do an audit before uploading a new revision.

libcxx/include/module.modulemap
716

Bad things. Seriously though, it means that whatever standard stuff pair.h exports which currently lives in __tuple won't be transitively exported. I haven't properly looked into what __tuple owns, but there are a few things, and we'll need to detangle them from tuple's dependencies.

Both of those tasks are far lower on my P-list: I'm trying to focus more on improving potential UX for our existing header splices as well as implementing ranges algos.

Mordante added inline comments.Jul 5 2021, 12:15 PM
libcxx/include/module.modulemap
716

@cjdb This sounds like the same issue I have with <optional> in <__format/format_context.h> (D103357)

EricWF requested changes to this revision.Jul 7 2021, 11:37 AM
EricWF added a subscriber: EricWF.

So we're splitting up the headers (which I don't love, but will hold my tongue for the most part).

The intention is to provide better name hygene for our users. Meaning headers will export _fewer_ names than they do at present.
This is an intentional BREAKING change. Users will have to update their code to include the correct headers before it will compile.

Before we do this we need to:

  1. Announce and discuss this change, it's rational, and its benefits to *all libc++ vendors*.
  2. Document the list of names removed from every header. This list should be complete.
  3. Ensure Clang's error messages tell users _which_ headers to include. Or at least don't tell users to include the wrong (implementation) headers.
cjdb added a comment.Jul 7 2021, 11:43 AM

So we're splitting up the headers (which I don't love, but will hold my tongue for the most part).

The intention is to provide better name hygene for our users. Meaning headers will export _fewer_ names than they do at present.
This is an intentional BREAKING change. Users will have to update their code to include the correct headers before it will compile.

Before we do this we need to:

  1. Announce and discuss this change, it's rational, and its benefits to *all libc++ vendors*.
  2. Document the list of names removed from every header. This list should be complete.
  3. Ensure Clang's error messages tell users _which_ headers to include. Or at least don't tell users to include the wrong (implementation) headers.

Agreed. Specifically responding to (3): D105439 is the first of a few patches that I'm working on to get Clang to minimise the pain from breakages and prevent users from making certain mistakes in the interim.

So we're splitting up the headers (which I don't love, but will hold my tongue for the most part).

The intention is to provide better name hygene for our users. Meaning headers will export _fewer_ names than they do at present.
This is an intentional BREAKING change. Users will have to update their code to include the correct headers before it will compile.

Before we do this we need to:

  1. Announce and discuss this change, it's rational, and its benefits to *all libc++ vendors*.
  2. Document the list of names removed from every header. This list should be complete.
  3. Ensure Clang's error messages tell users _which_ headers to include. Or at least don't tell users to include the wrong (implementation) headers.

It is indeed a breaking change, albeit one that is easy to remediate for users. But on a large scale, I agree it can be challenging to tackle.

While there is a clear benefit in making the headers less monolithic, minimizing the includes themselves is, as someone who just wants to ship an updated libc++, primarily a pain point. I'm willing to play ball and go through the pain of fixing my direct users (even though that's a huge pain), but if another vendor finds this change too disruptive, I can also get behind that. At the end of the day, the pain is real, but the benefit has a component of "pedantic correctness" to it. It would be a lot easier to make this change if we somehow could make it gradually, but I think the only way to do that would be for users to make sure their code always builds with modules enabled and disabled.

cjdb planned changes to this revision.Jul 7 2021, 1:17 PM

So we're splitting up the headers (which I don't love, but will hold my tongue for the most part).

The intention is to provide better name hygene for our users. Meaning headers will export _fewer_ names than they do at present.
This is an intentional BREAKING change. Users will have to update their code to include the correct headers before it will compile.

Before we do this we need to:

  1. Announce and discuss this change, it's rational, and its benefits to *all libc++ vendors*.
  2. Document the list of names removed from every header. This list should be complete.
  3. Ensure Clang's error messages tell users _which_ headers to include. Or at least don't tell users to include the wrong (implementation) headers.

It is indeed a breaking change, albeit one that is easy to remediate for users. But on a large scale, I agree it can be challenging to tackle.

While there is a clear benefit in making the headers less monolithic, minimizing the includes themselves is, as someone who just wants to ship an updated libc++, primarily a pain point. I'm willing to play ball and go through the pain of fixing my direct users (even though that's a huge pain), but if another vendor finds this change too disruptive, I can also get behind that. At the end of the day, the pain is real, but the benefit has a component of "pedantic correctness" to it. It would be a lot easier to make this change if we somehow could make it gradually, but I think the only way to do that would be for users to make sure their code always builds with modules enabled and disabled.

I'm starting to form a longer-term plan, so it would be best if we held off on merging this (and its children) till after the branch point at the end of the month. I have no idea how long it will take for my ideas to get integrated into Clang (and other relevant LLVM tools, if necessary). @ldionne I think it would also be good for us to not immediately fix anyone, so that we can get deployment experience (across multiple customers) for the features I end up implementing. Then we can publish a guide explaining how users ought to migrate to libc++next.

cjdb added a comment.Jul 15 2021, 10:42 AM

There might be a potential workaround for this effort that lets us make progress, while also not throwing the world under the bus. <Windows.h> has a WIN32_LEAN_AND_MEAN macro that users can opt into if they want a slimmer <Windows.h>. We could have a similar macro that users opt into if they want slimmer libc++ headers for the time being. Once we've gotten tooling to a spot where we're happy users have automated remediation, we can then remove our use of this macro, and deprecate _LIBCPP_LEAN_AND_MEAN.

In summary: no one gets broken, anyone who wants slimmer headers gets them, and we eventually get to the "ideal" state. @ldionne, @EricWF, WDYT?

There might be a potential workaround for this effort that lets us make progress, while also not throwing the world under the bus. <Windows.h> has a WIN32_LEAN_AND_MEAN macro that users can opt into if they want a slimmer <Windows.h>. We could have a similar macro that users opt into if they want slimmer libc++ headers for the time being. Once we've gotten tooling to a spot where we're happy users have automated remediation, we can then remove our use of this macro, and deprecate _LIBCPP_LEAN_AND_MEAN.

In summary: no one gets broken, anyone who wants slimmer headers gets them, and we eventually get to the "ideal" state. @ldionne, @EricWF, WDYT?

Part of me wants to say "no, let's not add yet another configuration that'll only make things more complicated", and part of me wants to say "yes, that will allow us to keep making progress on this header minimization without breaking users".

However, since there is very little benefit to minimizing the headers if nobody's going to turn on that LEAN_AND_MEAN flag, I'm not sure. Instead, here's what I think would be best:

  1. Minimize the includes in all the detail headers, and in the public headers too.
  2. However, for "backwards compatibility", keep the unnecessary includes in the public headers, but group them and put them at the end of all the #includes:
#include <__utility/whatever.h>
#include <__ranges/whatever.h>

// Kept for backwards compatibility until we can remove them
#include <utility>
#include <whatever>

Eventually, when we're ready to make the move, we'll be able to very easily make the jump for everyone by just removing those stray includes from all public headers. We might need to do a few adjustments before we can drop them (because as time passes, we might sometimes start relying on a few transitive includes by mistake here and there), but that should be much easier to do than what we're doing now.

I guess I'm just somewhat reluctant to adding yet another user-visible knob.

libcxx/include/algorithm
655

I would definitely wait for a separate NFC commit, since doing it here will obscure the changes we're making in this patch, which are all precisely #include-related.

cjdb added a comment.Jul 15 2021, 12:14 PM

There might be a potential workaround for this effort that lets us make progress, while also not throwing the world under the bus. <Windows.h> has a WIN32_LEAN_AND_MEAN macro that users can opt into if they want a slimmer <Windows.h>. We could have a similar macro that users opt into if they want slimmer libc++ headers for the time being. Once we've gotten tooling to a spot where we're happy users have automated remediation, we can then remove our use of this macro, and deprecate _LIBCPP_LEAN_AND_MEAN.

In summary: no one gets broken, anyone who wants slimmer headers gets them, and we eventually get to the "ideal" state. @ldionne, @EricWF, WDYT?

Part of me wants to say "no, let's not add yet another configuration that'll only make things more complicated", and part of me wants to say "yes, that will allow us to keep making progress on this header minimization without breaking users".

However, since there is very little benefit to minimizing the headers if nobody's going to turn on that LEAN_AND_MEAN flag, I'm not sure. Instead, here's what I think would be best:

Someone from the Chrome toolchain team sent me an email this morning asking if we can trim our headers, since their include graph blames our user-facing headers as a huge build time sink. My news to them about our current stalling left a bad taste in my mouth, but I was inspired by something that they said about D90072 to think creatively, and then remembered WIN32_LEAN_AND_MEAN. Since they reached out to me, this has gone from "I think this is a space to improve" to "there is a customer requesting something in this space". If they like the idea, I think it's worth entertaining, since Chrome takes forever to build. I've asked someone from that team to chime in as well.

  1. Minimize the includes in all the detail headers, and in the public headers too.
  2. However, for "backwards compatibility", keep the unnecessary includes in the public headers, but group them and put them at the end of all the #includes:
#include <__utility/whatever.h>
#include <__ranges/whatever.h>

// Kept for backwards compatibility until we can remove them
#include <utility>
#include <whatever>

Eventually, when we're ready to make the move, we'll be able to very easily make the jump for everyone by just removing those stray includes from all public headers. We might need to do a few adjustments before we can drop them (because as time passes, we might sometimes start relying on a few transitive includes by mistake here and there), but that should be much easier to do than what we're doing now.

If we're vigilant about our modules build (and I think we're starting to automate a check in that space), then this should only happen while we're including user-facing headers.

I guess I'm just somewhat reluctant to adding yet another user-visible knob.

I don't really like it any more than you do, but it's a user-visible knob with a very short life expectancy.

Someone from the Chrome toolchain team sent me an email this morning asking if we can trim our headers, since their include graph blames our user-facing headers as a huge build time sink.

Compared to all the other stuff in a big project like Chrome, blaming libc++ surprises me. Are they claiming that Chrome builds faster against libstdc++, then? Any chance of getting them to go public with their measurements? And they're blaming specifically the size of the headers (including, I should add, the disk traffic from opening and closing all these little files), not anything about the actual contents/implementation? E.g. I myself have done a blog post titled "37% of HyperRogue's compilation time is due to std::function" (2019-01-06), but that's nothing to do with disk traffic nor compiler-internal data structures, and everything to do with the bloated std::function API/ABI itself.

I don't really like it any more than you do, but it's a user-visible knob with a very short life expectancy.

User-visible features never really have short life expectancies. ISTR someone talking about "Hyrum's Law" on this very Phabricator instance, fairly recently... ;)

cjdb added a comment.Jul 15 2021, 2:10 PM

Someone from the Chrome toolchain team sent me an email this morning asking if we can trim our headers, since their include graph blames our user-facing headers as a huge build time sink.

Compared to all the other stuff in a big project like Chrome, blaming libc++ surprises me. Are they claiming that Chrome builds faster against libstdc++, then?

No. Chrome isn't built against libstdc++ (I'm not sure if it's even possible). Granted, libstdc++ is out of scope.

Any chance of getting them to go public with their measurements?

Good question, but I don't have an answer for you.

And they're blaming specifically the size of the headers (including, I should add, the disk traffic from opening and closing all these little files), not anything about the actual contents/implementation? E.g. I myself have done a blog post titled "37% of HyperRogue's compilation time is due to std::function" (2019-01-06), but that's nothing to do with disk traffic nor compiler-internal data structures, and everything to do with the bloated std::function API/ABI itself.

I'll also leave this question to the Chrome toolchain team to answer, but it seems to me that the little files are hardly a blip on the radar, and that a dominating factor is that we're (libc++) unnecessarily transitively including a lot more than we need to be.

I don't really like it any more than you do, but it's a user-visible knob with a very short life expectancy.

User-visible features never really have short life expectancies. ISTR someone talking about "Hyrum's Law" on this very Phabricator instance, fairly recently... ;)

This won't end up being a Hyrumbug. Hyrum's law states that given a sufficiently large user base and enough time, anything that is observable eventually becomes the interface, even if it's intended to be an implementation detail, which will lead to breaks on anything that's public facing. In this case, we'd have a public-facing macro (so it's already exempt), that would be deprecated when it's no longer necessary (so users stop defining it), and even once we remove said deprecation notice, it won't break anyone. It'll just be a benign eyesore in some code.

hans added a subscriber: hans.Jul 20 2021, 8:26 AM

However, since there is very little benefit to minimizing the headers if nobody's going to turn on that LEAN_AND_MEAN flag, I'm not sure. Instead, here's what I think would be best:

Someone from the Chrome toolchain team sent me an email this morning asking if we can trim our headers, since their include graph blames our user-facing headers as a huge build time sink. My news to them about our current stalling left a bad taste in my mouth, but I was inspired by something that they said about D90072 to think creatively, and then remembered WIN32_LEAN_AND_MEAN. Since they reached out to me, this has gone from "I think this is a space to improve" to "there is a customer requesting something in this space". If they like the idea, I think it's worth entertaining, since Chrome takes forever to build. I've asked someone from that team to chime in as well.

If there were a LIBCPP_LEAN_AND_MEAN flag which reduced the header sizes meaningfully, we'd certainly use it.

In order to enable it, having Clang diagnostics with good fixit hints would certainly help, but we could probably do without it too with some scripting, so don't block on that on our account.

Someone from the Chrome toolchain team sent me an email this morning asking if we can trim our headers, since their include graph blames our user-facing headers as a huge build time sink.

Compared to all the other stuff in a big project like Chrome, blaming libc++ surprises me. Are they claiming that Chrome builds faster against libstdc++, then? Any chance of getting them to go public with their measurements? And they're blaming specifically the size of the headers (including, I should add, the disk traffic from opening and closing all these little files), not anything about the actual contents/implementation? E.g. I myself have done a blog post titled "37% of HyperRogue's compilation time is due to std::function" (2019-01-06), but that's nothing to do with disk traffic nor compiler-internal data structures, and everything to do with the bloated std::function API/ABI itself.

We're not blaming our slow builds on libc++, but we're looking everywhere we can, and if the libc++ headers could be made slimmer, it would certainly help since we rely on many of them almost everywhere, and I'd expect it would be a win for other projects too.

The numbers we're looking at are public, and can be found here: https://commondatastorage.googleapis.com/chromium-browser-clang/include-analysis.html That doesn't measure compile time, but compiler input size. In other measurements we've seen a strong correlation between compiler input size and compile time. It's not surprising that libc++ headers are on top since they're included everywhere, but that also means there's a high chance slimmer libc++ headers would have measurable impact for us.

hans added a comment.Jul 20 2021, 9:08 AM

The numbers we're looking at are public, and can be found here: https://commondatastorage.googleapis.com/chromium-browser-clang/include-analysis.html That doesn't measure compile time, but compiler input size. In other measurements we've seen a strong correlation between compiler input size and compile time. It's not surprising that libc++ headers are on top since they're included everywhere, but that also means there's a high chance slimmer libc++ headers would have measurable impact for us.

The same analysis for an LLVM build is available here: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html though that uses libstdc++ (and hasn't been updated in a while). The standard headers show up on top, and I assume that's common across most c++ projects.

cjdb abandoned this revision.Sep 7 2022, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 7 2022, 10:59 AM