This is an archive of the discontinued LLVM Phabricator instance.

ELF: Add support for short thunks on ARM.
ClosedPublic

Authored by pcc on Mar 27 2018, 3:39 PM.

Details

Summary

A short thunk uses a direct branch (b or b.w) instruction, and is used
when the target has the same thumbness as the thunk and is within
direct branch range (32MB for ARM, 16MB for Thumb-2). Reduces the
size of Chromium for Android's .text section by around 160KB.

Depends on D44962

Diff Detail

Repository
rL LLVM

Event Timeline

pcc created this revision.Mar 27 2018, 3:39 PM
ruiu added a comment.Mar 27 2018, 7:29 PM

Generally looking good.

lld/ELF/Thunks.cpp
67 ↗(On Diff #140015)

It would be nice to explain as a comment about why we have more than one thunk for ARM.

73–76 ↗(On Diff #140015)

Could you add comments for these new member functions?

234 ↗(On Diff #140015)

I'd add a comment saying that this function returns true if the target is not Thumb and is within 2^26.

Thanks for putting in the effort to doing this, especially updating the tests! This can also apply to AArch64 as well. The only concern that I have right now are whether it is possible to construct an example that oscillates between short and long thunks and would infinite loop if it were not for the iteration count. In general transitioning from a short thunk to a large thunk will increase the distance between sections, however there is one case where adding a thunk, or increasing the distance can shorten a distance.

OS1 : { *(some sections) }
OS2 0x2000000 : { *(some more sections) }

If in OS1 the ThunkSection size increases then the gap between the InputSections following it and OS2 will narrow, this might mean that Thunks in OS2 that target OS1 can now be written as short range, which then shrinks the distance between OS1 and the InputSections after the OS2 ThunkSection. I haven't got a test case that provokes this behaviour, neither have I convinced myself that it is impossible. I think that this case could be avoided by never transitioning a thunk from long to short, but I think it will only be worth doing if it is possible to make a failing test case.

pcc added a comment.Mar 28 2018, 1:31 PM

Thanks for putting in the effort to doing this, especially updating the tests! This can also apply to AArch64 as well. The only concern that I have right now are whether it is possible to construct an example that oscillates between short and long thunks and would infinite loop if it were not for the iteration count. In general transitioning from a short thunk to a large thunk will increase the distance between sections, however there is one case where adding a thunk, or increasing the distance can shorten a distance.

OS1 : { *(some sections) }
OS2 0x2000000 : { *(some more sections) }

If in OS1 the ThunkSection size increases then the gap between the InputSections following it and OS2 will narrow, this might mean that Thunks in OS2 that target OS1 can now be written as short range, which then shrinks the distance between OS1 and the InputSections after the OS2 ThunkSection. I haven't got a test case that provokes this behaviour, neither have I convinced myself that it is impossible. I think that this case could be avoided by never transitioning a thunk from long to short, but I think it will only be worth doing if it is possible to make a failing test case.

It seems to be possible. I was able to come up with this failing test case:

$ cat test.lds
SECTIONS {
  .foo : { *(.foo) }
  .bar 0x3000000 : { *(.bar) }
}
$ cat test.s
.section .foo,"ax",%progbits,unique,1
.zero 0x1000000

.section .foo,"ax",%progbits,unique,2
foo:
bl bar

.section .bar,"ax",%progbits,unique,1
bar:
bl foo
.zero 0x1000000
$ ../ra/bin/llvm-mc -triple armv7-unknown-gnu -filetype=obj -o test.o test.s; ../ra/bin/ld.lld test.o test.lds;objdump -d
../ra/bin/ld.lld: warning: lld uses blx instruction, no object with architecture supporting feature detected.
../ra/bin/ld.lld: warning: lld uses extended branch encoding, no object with architecture supporting feature detected.
../ra/bin/ld.lld: warning: lld may use movt/movw, no object with architecture supporting feature detected.
../ra/bin/ld.lld: error: thunk creation not converged

I also think this can be fixed by making sure that we cannot transition from long to short. I will implement that.

pcc updated this revision to Diff 140152.Mar 28 2018, 3:45 PM
pcc marked 3 inline comments as done.
  • Address review comments

Thanks for the update, I'm happy with the changes and the new test case.

ruiu accepted this revision.Mar 29 2018, 2:16 PM

LGTM

Thank you for writing the comment. That improved the code quality a lot.

lld/ELF/Thunks.cpp
85 ↗(On Diff #140152)

Should this be private to make it clear that only mayUseShortThunk() is a public interface?

272 ↗(On Diff #140152)

Could you write assignment and return in separate lines?

This revision is now accepted and ready to land.Mar 29 2018, 2:16 PM
This revision was automatically updated to reflect the committed changes.
pcc marked 2 inline comments as done.