This is an archive of the discontinued LLVM Phabricator instance.

[RFC][libc++] Reduce the size of translation units
AbandonedPublic

Authored by Mordante on Aug 7 2023, 10:12 AM.

Details

Reviewers
jdoerfert
EricWF
Group Reviewers
Restricted Project
Summary

The C++ standard library adds new features in every C++ version. To
implement these new features libc++ adds more includes to a header. This
means the number of transitive includes grows with newer language versions.

The growing size is an issue for users. This change is motivated by a
report of the Chromium team where properly enabling C++23's
std::formatter<std::vector<bool>::reference> had a significant impact on
their build time.

In D149543 Hans Wennborg reported the following:

In Chromium we noticed that this almost doubled the preprocessed size
of <vector>, from ca 1.6 MB to 3.2 MB. Since it's a widely included
header, that results in ca 8 GB (2.5%) of extra code to compile during a
full build.

This is an experiment to see how libc++ can reduce this size by only
"enabling" headers per language version. This is mainly done at the level
of the granularized includes, since they often have features for one or a
limited set of minimum language versions.

The compare header is an exception: its inclusion is often mandated by the
standard, so this header is entirely guarded by C++20 or newer.
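
The guarding described above can be illustrated with a minimal user-level sketch. It is not the code from this patch: it assumes the standard __cplusplus macro as a stand-in for libc++'s internal language-version check, and compare_header_enabled is a hypothetical helper added only to make the effect observable.

```cpp
#include <vector>

// Assumption: __cplusplus stands in for the library's internal
// language-version macro; the patch itself is not reproduced here.
#if __cplusplus >= 202002L
#  include <compare>  // only pulled in when compiling as C++20 or newer
#endif

// Hypothetical helper: reports whether the C++20-only include above
// was enabled for this translation unit.
constexpr bool compare_header_enabled() {
#if __cplusplus >= 202002L
  return true;
#else
  return false;
#endif
}
```

Compiled with an older -std, the preprocessor never opens <compare>, so its contents and its own transitive includes do not contribute to the size of the translation unit.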

At the moment the removed transitive includes are not conditionally
restored like we usually do; this is intentional for the RFC. It gives a
good view of what changes.

For this patch only vector is adjusted, mainly based on what I recall
being tied to a language version.

I did some testing with the size of a preprocessed file that contains

#include <vector>

I used the following command:

clang++ -E test.cpp -nostdinc++ -I <build>/include/c++/v1" -I -std=$std |wc -l
Version  | Before | After
---------+--------+-------
C++03    | 53153  | 38721
C++11    | 55665  | 40914
C++14    | 56709  | 41896
C++17    | 60591  | 44957
C++20    | 75905  | 75863
C++23    | 61867  | 61828
C++26    | 61867  | 61828

The interesting value is C++23 (before): there the size drops a lot
compared to C++20. In C++23 some transitive includes are removed
unconditionally. In older language versions these can be removed with a
build option, which the Chromium team uses. This is done by the following
command:

clang++ -E test.cpp -nostdinc++ -I <build>/include/c++/v1" -I -std=$std -D_LIBCPP_REMOVE_TRANSITIVE_INCLUDES |wc -l
Version  | Before | After
---------+--------+-------
C++03    | 45883  | 18482
C++11    | 48366  | 19352
C++14    | 49352  | 19745
C++17    | 51590  | 22331
C++20    | 61259  | 61217
C++23    | 61867  | 61828
C++26    | 61867  | 61828
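
To make the two tables easier to compare, the relative reduction per language version can be computed from the reported line counts. The following is a small sketch, not part of the patch: Row, reduction_percent, and print_reductions are hypothetical names, and the numbers are copied unchanged from the tables above.

```cpp
#include <cstddef>
#include <cstdio>

// One measurement: preprocessed line counts of a file that only
// includes <vector>, before and after the patch.
struct Row {
  const char* version;
  long before;
  long after;
};

// Relative reduction in percent; `before` must be non-zero.
constexpr double reduction_percent(long before, long after) {
  return 100.0 * static_cast<double>(before - after) /
         static_cast<double>(before);
}

// Values copied from the first table (default build).
constexpr Row kDefaultBuild[] = {
    {"C++03", 53153, 38721}, {"C++11", 55665, 40914},
    {"C++14", 56709, 41896}, {"C++17", 60591, 44957},
    {"C++20", 75905, 75863}, {"C++23", 61867, 61828},
    {"C++26", 61867, 61828},
};

// Values copied from the second table
// (built with -D_LIBCPP_REMOVE_TRANSITIVE_INCLUDES).
constexpr Row kRemoveTransitive[] = {
    {"C++03", 45883, 18482}, {"C++11", 48366, 19352},
    {"C++14", 49352, 19745}, {"C++17", 51590, 22331},
    {"C++20", 61259, 61217}, {"C++23", 61867, 61828},
    {"C++26", 61867, 61828},
};

// Hypothetical helper: prints one line per language version.
template <std::size_t N>
void print_reductions(const char* title, const Row (&rows)[N]) {
  std::printf("%s\n", title);
  for (const Row& row : rows) {
    std::printf("  %s: %.1f%%\n", row.version,
                reduction_percent(row.before, row.after));
  }
}
```

Calling print_reductions("default", kDefaultBuild) and print_reductions("remove transitive includes", kRemoveTransitive) from main prints the percentages. With these inputs the reduction is roughly a quarter or more up to C++17 and stays below 0.1% from C++20 onward in both configurations.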

Diff Detail

Event Timeline

Mordante created this revision. Aug 7 2023, 10:12 AM
Herald added a project: Restricted Project. Aug 7 2023, 10:12 AM
Mordante requested review of this revision. Aug 7 2023, 10:12 AM
Herald added a project: Restricted Project.
Herald added a reviewer: Restricted Project.
Mordante added inline comments. Aug 7 2023, 10:15 AM
libcxx/include/__format/formatter.h
14

Note <vector> includes this in C++20 even when it's only used in C++23. So in <vector> we could make this include optional.

EricWF requested changes to this revision. Aug 11 2023, 4:25 AM
EricWF added a subscriber: EricWF.

I think guarded includes are a really bad idea. I think it's inviting us to create include cycles that are only present in certain dialects & configurations.
I would much rather pay the size increase than do this.

This revision now requires changes to proceed. Aug 11 2023, 4:25 AM

I was really excited about this when I saw the numbers initially. However, the benefits of this patch seem to go down very drastically after C++20, so I am left a bit unsure whether it's worth making this wide-ranging change in libc++. Indeed, we want to push our users to newer standard versions -- for example Clang now defaults to C++17, and eventually will default to C++20. In light of this, I am not sure it is worth shuffling around so much code for a benefit that will go away. Instead, I think the most reliable way of solving compilation time issues is going to be for users to adopt modules. I know this is a really frustrating answer to a lot of people, but concretely that's what WG21 is pushing people towards. For example, when they specify that <vector> should include <format>, they are thinking "yeah users will use modules anyway", otherwise they would make different design choices. So basically I think we might be fighting against the current if we try to optimize build times this way instead.

Mordante abandoned this revision. Nov 3 2023, 12:01 PM

@EricWF @ldionne Thanks for the feedback. This was intended as an RFC and an experiment to see what's possible.
Personally I was also not convinced this is worth the effort and I have not heard from the Chromium team since.

So the patch has served its purpose and I'll abandon it.