This is an archive of the discontinued LLVM Phabricator instance.

LowerBitSets: Align referenced globals.
ClosedPublic

Authored by pcc on Feb 24 2015, 4:08 PM.

Details

Summary

This change aligns globals to the next power of 2 bytes, up to a maximum
of 128. This makes it more likely that we can compress bit sets using a
greater alignment. In many more cases, we can now also take advantage of a
new optimization, introduced in this patch, that removes bit set checks if
the bit set is all ones.

The 128 byte maximum was found to provide the best tradeoff between instruction
overhead and data overhead in a recent build of Chromium. It allows us to
remove ~900KB of instructions at the cost of ~250KB of data.
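
As a rough illustration of the summary above, the following standalone sketch shows one way to read "next power of 2, up to a maximum of 128" and why the extra alignment tends to produce all-ones bit sets. It is not the code from this patch: the helper names (nextPowerOfTwo, cappedAlignment, paddingAfter, isAllOnes) are hypothetical, and the assumption that each global is padded up to a multiple of its chosen alignment is ours.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Cap taken from the summary: alignment never exceeds 128 bytes.
const uint64_t kMaxAlign = 128;

// Smallest power of two that is >= n (for n >= 1).
inline uint64_t nextPowerOfTwo(uint64_t n) {
  uint64_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Hypothetical helper: alignment chosen for a global of `size` bytes.
inline uint64_t cappedAlignment(uint64_t size) {
  uint64_t a = nextPowerOfTwo(size);
  return a > kMaxAlign ? kMaxAlign : a;
}

// Padding that rounds `size` up to a multiple of cappedAlignment(size)
// (an assumption about the layout, for illustration only).
inline uint64_t paddingAfter(uint64_t size) {
  uint64_t a = cappedAlignment(size);
  return (a - size % a) % a;
}

// True if the members at `offsets` (sorted, starting at 0) occupy every
// `stride`-byte slot up to the last member, i.e. a bit set with one bit
// per `stride` bytes would be all ones.
inline bool isAllOnes(const std::vector<uint64_t> &offsets, uint64_t stride) {
  for (std::size_t i = 0; i < offsets.size(); ++i)
    if (offsets[i] != i * stride)
      return false;
  return true;
}
```

For example, three unpadded 24-byte globals sit at offsets 0, 24 and 48; with one bit per 8 bytes only bits 0, 3 and 6 are set, so the check needs a real bit set lookup. Padding each to 32 bytes puts them at 0, 32 and 64, and with one bit per 32 bytes every bit is set, which is the case the new optimization can drop.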

Diff Detail

Repository
rL LLVM

Event Timeline

pcc updated this revision to Diff 20635. Feb 24 2015, 4:08 PM
pcc retitled this revision to LowerBitSets: Align referenced globals.
pcc added reviewers: kcc, jfb.
kcc edited edge metadata. Feb 24 2015, 4:39 PM

~900KB of instructions

Out of how many?

Depending on the percentage savings, we may or may not want the extra complexity.

pcc added a comment. Feb 24 2015, 4:56 PM

This change reduced the size of the .text section from 6701377 bytes to 5844340 bytes in a binary based on Chrome but with everything except bit set checks removed. That is a saving of 857037 bytes, or around 13%. I'll also collect figures for Chrome itself overnight.

kcc added a comment. Feb 24 2015, 5:56 PM

This also deserves to be mentioned in the design document clang.llvm.org/docs/ControlFlowIntegrityDesign.html
(probably create a section "Optimisations")
Since this is a security feature, we want to make sure that the security researchers who compare
the design document with the actual generated code are not surprised.

pcc added a comment. Feb 25 2015, 11:36 AM

Size info for chrome (before, after):

out_cfi_0bf03cb47380bf9bea4e738b7b5f2ded56eb335b/Release/chrome  :
section                       size        addr
.interp                         28         624
.note.ABI-tag                   32         652
.note.gnu.build-id              36         684
.dynsym                      44688         720
.dynstr                      38634       45408
.gnu.hash                      380       84048
.gnu.version                  3724       84428
.gnu.version_r                 960       88152
.rela.dyn                 13011840       89112
.rela.plt                    41232    13100952
.init                           26    13142184
.plt                         27504    13142224
.text                    112043598    13169728
malloc_hook                   4304   125213328
google_malloc                 4769   125217632
.fini                            9   125222404
.rodata                   12601040   125222656
.eh_frame                  7829036   137823696
.eh_frame_hdr              1680676   145652732
.tbss                           16   147341264
.data.rel.ro.local         1927376   147341264
.jcr                             8   149268640
.fini_array                      8   149268648
.init_array                     56   149268656
.data.rel.ro               3470504   149268720
.dynamic                      1152   152739224
.got                          2072   152740376
.got.plt                     13768   152742448
.tm_clone_table                  0   152756224
.data                       153243   152756224
.bss                        444996   152909472
.comment                        92           0
.note.gnu.gold-version          28           0
Total                    153345835


out_cfi_a0ad911453217da8f0d553bc9be5f6162bca9ab8/Release/chrome  :
section                       size        addr
.interp                         28         624
.note.ABI-tag                   32         652
.note.gnu.build-id              36         684
.dynsym                      44688         720
.dynstr                      38634       45408
.gnu.hash                      380       84048
.gnu.version                  3724       84428
.gnu.version_r                 960       88152
.rela.dyn                 13011840       89112
.rela.plt                    41232    13100952
.init                           26    13142184
.plt                         27504    13142224
.text                    109658814    13169728
malloc_hook                   4304   122828544
google_malloc                 4759   122832848
.fini                            9   122837608
.rodata                   12349136   122837760
.eh_frame                  7812260   135186896
.eh_frame_hdr              1680676   142999156
.tbss                           16   144687376
.data.rel.ro.local         2010512   144687376
.jcr                             8   146697888
.fini_array                      8   146697896
.init_array                     56   146697904
.data.rel.ro               3878568   146697968
.dynamic                      1152   150576536
.got                          2072   150577688
.got.plt                     13768   150579760
.tm_clone_table                  0   150593536
.data                       153243   150593536
.bss                        444996   150746784
.comment                        92           0
.note.gnu.gold-version          28           0
Total                    151183561

So over 2MB savings overall, better than I expected!
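
The overall figure can be cross-checked from the two listings above. The sketch below is only a worked piece of arithmetic: it hard-codes the six sections whose sizes differ between the two builds and sums their changes; the type and function names are made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One row from the listings: section name, size before, size after (bytes).
struct SectionSize {
  std::string name;
  int64_t before;
  int64_t after;
};

// Only the sections whose size differs between the two listings.
inline std::vector<SectionSize> changedSections() {
  return {
      {".text", 112043598, 109658814},
      {"google_malloc", 4769, 4759},
      {".rodata", 12601040, 12349136},
      {".eh_frame", 7829036, 7812260},
      {".data.rel.ro.local", 1927376, 2010512},
      {".data.rel.ro", 3470504, 3878568},
  };
}

// Sum of (after - before); negative means the binary got smaller.
inline int64_t netChange(const std::vector<SectionSize> &sections) {
  int64_t total = 0;
  for (const SectionSize &s : sections)
    total += s.after - s.before;
  return total;
}
```

Here .text shrinks by 2384784 bytes and .rodata by 251904 bytes, while .data.rel.ro and .data.rel.ro.local grow by 408064 and 83136 bytes respectively. The net change is -2162274 bytes, which matches the difference between the two Total lines.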

kcc accepted this revision. Feb 25 2015, 12:12 PM
kcc edited edge metadata.

LGTM

This revision is now accepted and ready to land. Feb 25 2015, 12:12 PM
This revision was automatically updated to reflect the committed changes.
pcc added a comment. Feb 25 2015, 4:20 PM

This also deserves to be mentioned in the design document clang.llvm.org/docs/ControlFlowIntegrityDesign.html

r230588.