Page MenuHomePhabricator

[Preprocessor] Reduce the memory overhead of `#define` directives

Authored by arphaman on Fri, Jan 14, 11:14 AM.



Recently we observed high memory pressure caused by clang during some parallel builds. We discovered that we have several projects that have a large number of #define directives in their TUs (on the order of millions), which caused huge memory consumption in clang due to a lot of allocations for MacroInfo. We would like to reduce the memory overhead of clang for a single #define to reduce the memory overhead for these files, to allow us to reduce the memory pressure on the system during highly parallel builds. This change achieves that by removing the SmallVector in MacroInfo and instead storing the tokens in an array allocated using the bump pointer allocator, after all tokens are lexed.

The added unit test with 1000000 #define directives illustrates the problem. Prior to this change, on arm64 macOS, clang's PP bump pointer allocator allocated 272007616 bytes, and used roughly 272 bytes per #define. After this change, clang's PP bump pointer allocator allocates 120002016 bytes, and uses only roughly 120 bytes per #define.

For an example test file that we have internally with 7.8 million #define directives, this change produces the following improvement on arm64 macOS: Persistent allocation footprint for this test case file as it's being compiled to LLVM IR went down 22% from 5.28 GB to 4.07 GB and the total allocations went down 14% from 8.26 GB to 7.05 GB. Furthermore, this change reduced the total number of allocations made by the system for this clang invocation from 1454853 to 133663, an order of magnitude improvement.

Diff Detail

Unit TestsFailed

960 msx64 debian > SanitizerCommon-tsan-x86_64-Linux.Linux::decorate_proc_maps.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -gline-tables-only -fsanitize=thread -m64 -funwind-tables -ldl -g /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Linux/decorate_proc_maps.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/sanitizer_common/tsan-x86_64-Linux/Linux/Output/decorate_proc_maps.cpp.tmp

Event Timeline

arphaman created this revision.Fri, Jan 14, 11:14 AM
arphaman requested review of this revision.Fri, Jan 14, 11:14 AM
This revision is now accepted and ready to land.Fri, Jan 14, 2:41 PM

Just some minor nits from me, but generally LG.


I think this should be a const_tokens_iterator instead (and it's fine that we don't expose a non-const interface for the iterator).


Should we assert that we've not already allocated tokens before?


Should we do this dance for 32-bit systems as well?


Please spell out the type.

dexonsmith added inline comments.

Do I remember correctly that SourceLocation's size recently became configurable? Or maybe it will be soon? Should that be factored in somehow?

aaron.ballman added inline comments.Wed, Jan 19, 7:25 AM

Are you thinking about this review or something else?