This is an archive of the discontinued LLVM Phabricator instance.

[PPC64] Define getThunkSectionSpacing() based on the range of R_PPC64_REL24
Closed, Public

Authored by MaskRay on May 9 2019, 1:58 AM.

Details

Summary

Suggested by Sean Fertile and Peter Smith.

Thunk section spacing decreases the total number of thunks. I measured a
decrease of 1% or less in some large programs, with no perceivable
slowdown in link time. Override getThunkSectionSpacing() to enable it.
0x2000000 is the farthest point R_PPC64_REL24 can reach. I tried several
numbers and found that 0x2000000 works best. Numbers near 0x2000000 work
as well, but let's just use the simpler number.
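
For readers unfamiliar with where 0x2000000 comes from, the following is a minimal sketch, not code from this patch. It assumes the usual encoding of the PPC64 unconditional branch: a 24-bit signed field counted in 4-byte instruction words. The helper names (rel24Min, rel24Max, inRel24Range) are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Assumption: the branch encodes a 24-bit signed word offset,
// and the byte displacement is that offset times 4.
constexpr int64_t kRel24FieldBits = 24;
constexpr int64_t kInsnBytes = 4;

// Most negative byte displacement: -(2^23) * 4 = -0x2000000.
constexpr int64_t rel24Min() {
  return -(int64_t{1} << (kRel24FieldBits - 1)) * kInsnBytes;
}

// Most positive byte displacement: (2^23 - 1) * 4 = 0x1fffffc.
constexpr int64_t rel24Max() {
  return ((int64_t{1} << (kRel24FieldBits - 1)) - 1) * kInsnBytes;
}

// True if a branch located at `from` can reach `to` directly,
// i.e. without going through a thunk.
constexpr bool inRel24Range(uint64_t from, uint64_t to) {
  const int64_t disp = static_cast<int64_t>(to - from);
  return disp % kInsnBytes == 0 && disp >= rel24Min() && disp <= rel24Max();
}

static_assert(rel24Min() == -0x2000000, "backward reach is 32 MiB");
static_assert(rel24Max() == 0x1fffffc, "forward reach is one instruction short of 32 MiB");
```

Under these assumptions the backward reach is exactly 0x2000000 bytes and the forward reach is one instruction short of it, which is consistent with 0x2000000 being the natural spacing to try first.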

As demonstrated by the updated tests, this essentially changes placement
of most thunks to the end of the output section. We leverage this
property to fix PR40740 reported by Alfredo Dal'Ava Júnior:

The output section .init consists of input sections from several object
files (crti.o crtbegin.o crtend.o crtn.o). Sections other than the last
one do not have a terminator. With this patch, we create the thunk after
the last .init input section and thus fix the issue. This is not
foolproof but works quite well for such sections (with no terminator) in
practice.

Event Timeline

MaskRay created this revision. May 9 2019, 1:58 AM
Herald added a project: Restricted Project.

Number of thunks with different choices of getThunkSectionSpacing():

A

0: 79692
0x2000000-0x10000: 78449
0x2000000-0xc000: 78336
0x2000000-0x8000: 78332
0x2000000-0x4000: 78386
0x2000000-0x2000: 78365
0x2000000-0x1000: 78345
0x2000000-0x0000: 78320
0x2000000+0x4000: 78334

B

0: 112315
0x2000000-0xc000: 112474
0x2000000-0x8000: 112333
0x2000000-0x4000: 111937
0x2000000-0x0000: 111838
0x2000000+0x4000: 111851
0x2000000+0x8000: 111847

The "simplest" number 0x2000000 works no worse than other numbers so I'll just use it.
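
As a quick arithmetic check on the counts listed above, the relative reduction against the baseline (spacing 0) can be computed as in the sketch below. The function name reductionPercent is hypothetical. The inputs are the thunk counts reported for programs A and B with spacing 0 and with spacing 0x2000000.

```cpp
// Relative reduction in thunk count, in percent, compared with the
// baseline measured with spacing 0.
constexpr double reductionPercent(long baseline, long withSpacing) {
  return 100.0 * static_cast<double>(baseline - withSpacing) /
         static_cast<double>(baseline);
}

// Program A: 79692 thunks at spacing 0, 78320 at spacing 0x2000000.
static_assert(reductionPercent(79692, 78320) > 1.7 &&
              reductionPercent(79692, 78320) < 1.8, "A: about 1.7%");

// Program B: 112315 thunks at spacing 0, 111838 at spacing 0x2000000.
static_assert(reductionPercent(112315, 111838) > 0.4 &&
              reductionPercent(112315, 111838) < 0.5, "B: about 0.4%");
```

With the numbers above this gives roughly 1.7% for A and 0.4% for B.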

MaskRay updated this revision to Diff 198779. May 9 2019, 3:05 AM
MaskRay edited the summary of this revision.

Update description

sfertile accepted this revision. May 9 2019, 8:24 AM

The "simplest" number 0x2000000 works no worse than other numbers so I'll just use it.

That is pretty reasonable. LGTM.

test/ELF/ppc64-call-reach.s, line 68

This comment is out of date and can be removed.

This revision is now accepted and ready to land. May 9 2019, 8:24 AM
MaskRay updated this revision to Diff 198847. May 9 2019, 9:06 AM
MaskRay marked an inline comment as done.

Delete a comment

I've checked this against our huge internal code base. This doesn't cause new failures.

ruiu accepted this revision. May 9 2019, 10:35 PM

LGTM

MaskRay updated this revision to Diff 198982. May 9 2019, 10:46 PM
MaskRay edited the summary of this revision.

Update descriptions

This revision was automatically updated to reflect the committed changes.