This is an archive of the discontinued LLVM Phabricator instance.

[Serialization] Compress serialized macro expansion SLocEntries
Needs Review · Public

Authored by sammccall on Apr 25 2022, 3:47 PM.

Details

Reviewers
ilya-biryukov
Summary

Macro expansion SLocEntries account for a significant share of PCH size (7-10% in my tests).
Each entry stores the expansion end location as vbr8. However, that value is highly predictable:

  • for macro arg expansion it's zero
  • for object macros it equals the expansion start
  • for function macros it's usually shortly after the expansion start

Instead, this change stores (bool relative, unsigned value).
If relative is true, ExpEnd is ExpBegin+value, otherwise it's just value.
We define abbreviations to cover the common cases above.

This saves ~15% of the size of SM_SLOC_EXPANSION records, which is 1-1.5% of overall PCH size.
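The (relative, value) scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the names EncodedEnd, encodeExpansionEnd, decodeExpansionEnd and verifyRoundTrip are hypothetical, raw 32-bit offsets stand in for source locations, and the rule for choosing the relative form (whenever the end is not before the start) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair mirroring the (bool relative, unsigned value) record fields.
struct EncodedEnd {
  bool Relative;   // true: Value is an offset from ExpBegin
  uint32_t Value;  // either ExpEnd - ExpBegin or ExpEnd itself
};

// Assumed policy: use the relative form whenever ExpEnd >= ExpBegin.
inline EncodedEnd encodeExpansionEnd(uint32_t ExpBegin, uint32_t ExpEnd) {
  if (ExpEnd >= ExpBegin)
    return {true, ExpEnd - ExpBegin};
  return {false, ExpEnd};
}

// "If relative is true, ExpEnd is ExpBegin+value, otherwise it's just value."
inline uint32_t decodeExpansionEnd(uint32_t ExpBegin, EncodedEnd E) {
  return E.Relative ? ExpBegin + E.Value : E.Value;
}

// Checks that decoding an encoded end location gives back the original.
inline void verifyRoundTrip(uint32_t ExpBegin, uint32_t ExpEnd) {
  assert(decodeExpansionEnd(ExpBegin, encodeExpansionEnd(ExpBegin, ExpEnd)) ==
         ExpEnd);
}
```

Under these assumptions the three common cases all produce small values: a macro arg expansion (end is zero) encodes as (false, 0), an object macro (end equals start) as (true, 0), and a function macro as (true, small offset).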

Event Timeline

sammccall created this revision. · Apr 25 2022, 3:47 PM
Herald added a project: Restricted Project. · Apr 25 2022, 3:47 PM
sammccall requested review of this revision. · Apr 25 2022, 3:47 PM
Herald added a subscriber: cfe-commits.
ilya-biryukov added inline comments. · Apr 26 2022, 1:20 AM
clang/lib/Serialization/ASTWriter.cpp
Line 2181

NIT: maybe avoid the lambda? The code is short enough to be readable without early returns:

unsigned Abbrev = 0;
if (Expansion.isExpansionTokenRange()) {
  if (Expansion.isMacroArgExpansion())
    Abbrev = SLocArgExpansionAbbrv;
  else if (EndIsRelative && Expansion.isFunctionMacroExpansion())
    Abbrev = SLocFunctionExpansionAbbrv;
  else if (EndIsRelative)
    Abbrev = SLocObjectExpansionAbbrv;
}
Line 2195

It's probably obvious, but I'm not an expert in the bitcode format.

Can records be larger now than before? In particular, are any records emitted without an abbreviation larger than the same records emitted with one?
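For context on record sizes, here is a rough sketch of how many bits a VBR field occupies in the LLVM bitstream format, where each chunk of a vbrN field carries N-1 payload bits plus one continuation bit. The helper name vbrBits is hypothetical and this is not code from the patch. It only illustrates why a small relative offset is cheaper than a full location in a vbr8 field. It does not by itself say how abbreviated and unabbreviated records compare.

```cpp
#include <cstdint>

// Bits occupied by Value in a VBR field of the given chunk width.
// Assumes 2 <= Width <= 64; each chunk holds Width - 1 payload bits.
inline unsigned vbrBits(uint64_t Value, unsigned Width) {
  unsigned PayloadBits = Width - 1;
  unsigned Chunks = 1;  // a value of zero still occupies one chunk
  while (Value >>= PayloadBits)
    ++Chunks;
  return Chunks * Width;
}
```

For example, under this model any offset below 128 fits in a single 8-bit vbr8 chunk, while a location near 2^30 needs five chunks (40 bits).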