This is an archive of the discontinued LLVM Phabricator instance.

[MC] Use subclass data for MCExpr to reduce memory usage
ClosedPublic

Authored by nikic on Apr 11 2020, 3:30 AM.

Details

Summary

MCExpr has a bunch of free space that is currently going to waste. Repurpose it as 24 bits of subclass data, which is enough to reduce the size of all subclasses by 8 bytes. This gives us some respectable savings for debuginfo builds. Here are the max-rss reductions for the fat LTO link step (full data):

kc.link 		238MiB 	231MiB (-2.82%)
sqlite3.link 		258MiB 	250MiB (-3.27%)
consumer-typeset.link 	152MiB 	148MiB (-2.51%)
bullet.link 		197MiB 	192MiB (-2.30%)
tramp3d-v4.link 	578MiB 	567MiB (-1.92%)
pairlocalalign.link 	92MiB 	 90MiB (-1.98%)
clamscan.link 		230MiB 	223MiB (-2.81%)
lencod.link 		242MiB 	235MiB (-2.67%)
SPASS.link 		235MiB 	230MiB (-2.23%)
7zip-benchmark.link 	450MiB 	435MiB (-3.25%)

Diff Detail

Event Timeline

nikic created this revision.Apr 11 2020, 3:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2020, 3:30 AM

I find it hard to determine who should review this, seems like all the major contributors to this code are no longer active (including the code owner of the MC layer...)

Thanks for the change! It is good. I only got some nits.

include/llvm/MC/MCExpr.h
333 ↗(On Diff #256759)

Nit: no space after (unsigned). No outer parentheses.

Alternatively, use static_cast<unsigned>

365 ↗(On Diff #256759)

Nit: no space after (VariantKind).

Alternatively, use static_cast<VariantKind>

433 ↗(On Diff #256759)

ditto

589 ↗(On Diff #256759)

ditto

nikic updated this revision to Diff 256794.Apr 11 2020, 12:30 PM
nikic marked 4 inline comments as done.

Fix coding style: Remove space after casts.

jsji added a comment.Apr 13 2020, 11:34 AM

Mostly good to me. Thanks.

include/llvm/MC/MCExpr.h
56 ↗(On Diff #256794)

Nit: Maybe move SubclassData to last parameter and default to 0?

57 ↗(On Diff #256794)

Call setter to validate data instead?

64 ↗(On Diff #256794)

Can we add one setter setSubclassData() as well? We can validate the length of data there too.

139 ↗(On Diff #256794)

Can we also define SizeInBytesBits ?

Can we avoid the magic number 8 here?

Also add some comments about how SizeInBytes and PrintInHexBit are encoded in SubclassData? eg: endian order?

143 ↗(On Diff #256794)

Should we avoid clobberring by SizeInBytes & SizeInBytesBits as well?

325 ↗(On Diff #256794)

Similar to above, maybe also define KindBits , avoid magic number 16 here , and comments.

nikic updated this revision to Diff 257095.Apr 13 2020, 1:29 PM
nikic marked 3 inline comments as done.
  • Make SubclassData last ctor parameter and optional.
  • Add assertion that SubclassData is not too large.
  • Replace various magic numbers with constants.
include/llvm/MC/MCExpr.h
57 ↗(On Diff #256794)

MCExprs are immutable, so I don't think we should add a setter (and we don't have them for other fields either). However, I've added an assertion in the constructor to check that the subclass data fits.

143 ↗(On Diff #256794)

The assertion here is intended to make sure that clobbering can't happen.

MaskRay accepted this revision.Apr 13 2020, 3:49 PM

LGTM.

This revision is now accepted and ready to land.Apr 13 2020, 3:49 PM
jsji accepted this revision.Apr 13 2020, 7:44 PM
jsji added inline comments.
include/llvm/MC/MCExpr.h
46 ↗(On Diff #257095)

Why we choose enum here?
Can we set it to const calculated from sizeof(unsigned int)?

143 ↗(On Diff #256794)

Yes, but how about non-assert build?

nikic marked an inline comment as done.Apr 14 2020, 1:00 AM
nikic added inline comments.
include/llvm/MC/MCExpr.h
46 ↗(On Diff #257095)

Why we choose enum here?

I'm actually not sure... I followed what I saw other similar code doing. My assumption was that some compilers do not allow using static const in bitfield widths (as they aren't "real" constants), but after some googling around I didn't find any evidence of that, so I'll switch this to use static const and go back to the enum if it breaks any bots.

Can we set it to const calculated from sizeof(unsigned int)?

Something like CHAR_BIT * (sizeof(unsigned int) - sizeof(uint8_t))? I'm not sure this is better, because it makes it non-obvious how much space is actually free to use.

jsji added inline comments.Apr 14 2020, 10:03 AM
include/llvm/MC/MCExpr.h
46 ↗(On Diff #257095)

Something like CHAR_BIT * (sizeof(unsigned int) - sizeof(uint8_t))? I'm not sure this is better, because it makes it non-obvious how much space is actually free to use.

OK, make sense too. Maybe add a comments to 24 == CHAR_BIT * (sizeof(unsigned int) - sizeof(ExprKind))?

This revision was automatically updated to reflect the committed changes.