This is an archive of the discontinued LLVM Phabricator instance.

[lld][ARM] don't use short thumb thunks if no branch range extension
ClosedPublic

Authored by stuij on Dec 23 2022, 9:50 AM.

Details

Summary

In ThumbThunk::isCompatibleWith, we check if we can use short thunks if we are
within branch range. However these short thumb thunks will generate b.w
instructions, and these are not available on pre branch range extension
architectures.

On these architectures (v4, v5, and most of v6), we could replace the b.w with a
Thumb b (2) instruction, but that would in an ideal situation only give us an
extra range of 2048 bytes on top of the 4MB range of a BL, if a thunk section
happens to be placed on the outer range of a BL and the stars are aligned. It
doesn't seem worth it.

What would be worth it is a state change to Arm and a subsequent branch to
either Arm or Thumb code. But that's the subject of another patch.

Diff Detail

Event Timeline

stuij created this revision.Dec 23 2022, 9:50 AM
Herald added a project: Restricted Project. · View Herald Transcript
stuij requested review of this revision.Dec 23 2022, 9:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 23 2022, 9:50 AM

On these architectures (v4, v5, and most of v6), we could replace the b.w with a Thumb b (2) instruction, but that would in an ideal situation only give us an extra range of 2048 bytes on top of the 4MB range of a BL,

Q: what is "Thumb b (2)" ?

if a thunk section happens to be placed on the outer range of a BL and the stars are aligned.

stars?

lld/ELF/Thunks.cpp
547
550
lld/test/ELF/arm-thumb-range-thunk-os-no-ext.s
4

The output file has 12+MiB, which is too large. Is it possible to make it smaller with a linker script?

16

/// for non-RUN-non-CHECK comments in new tests.

23

End full sentences with a period.

On these architectures (v4, v5, and most of v6), we could replace the b.w with a Thumb b (2) instruction, but that would in an ideal situation only give us an extra range of 2048 bytes on top of the 4MB range of a BL,

Q: what is "Thumb b (2)" ?

Its version 2 of the Thumb b instruction as per the v4-v6 Arm Architecture Reference Manual (doc nr: ddi 0100i).
So section A7.1.14 of https://documentation-service.arm.com/static/5f8dacc8f86e16515cdb865a

if a thunk section happens to be placed on the outer range of a BL and the stars are aligned.

stars?

'stars are aligned' figure of speech embellishing the notion that it's very unlikely. That part didn't include any actual information.

stuij added inline comments.Dec 23 2022, 1:02 PM
lld/test/ELF/arm-thumb-range-thunk-os-no-ext.s
4

I agree it's not great. I used arm-thumb-range-thunk-os.s as a template which is more than twice as big btw. I'll have a look what I can do.

stuij added inline comments.Dec 23 2022, 1:11 PM
lld/test/ELF/arm-thumb-range-thunk-os-no-ext.s
4

I think an or *the* issue is that the linker will behave differently when using linker script sections, because then the ThunkSection will be placed immediately after the code, which isn't what you want in this case.

On these architectures (v4, v5, and most of v6), we could replace the b.w with a Thumb b (2) instruction, but that would in an ideal situation only give us an extra range of 2048 bytes on top of the 4MB range of a BL,

Q: what is "Thumb b (2)" ?

Its version 2 of the Thumb b instruction as per the v4-v6 Arm Architecture Reference Manual (doc nr: ddi 0100i).
So section A7.1.14 of https://documentation-service.arm.com/static/5f8dacc8f86e16515cdb865a

Thanks for the manual. It uses "b (2)" which resolves my question. Where does "extra range of 2048 bytes" come from?

if a thunk section happens to be placed on the outer range of a BL and the stars are aligned.

stars?

'stars are aligned' figure of speech embellishing the notion that it's very unlikely. That part didn't include any actual information.

Thanks! I did not know "stars are aligned" and when I saw "aligned" I was thinking of what objects were aligned...

lld/test/ELF/arm-thumb-range-thunk-os-no-ext.s
4

Can you elaborate?

stuij marked an inline comment as done.Jan 5 2023, 8:58 AM

Thanks for the manual. It uses "b (2)" which resolves my question. Where does "extra range of 2048 bytes" come from?

The maximum range of a BL is 4MB. So of course the max a ThunkSection can be placed from this BL is also 4MB. At this point we could use a short Thunk which for Thumb on v4/5/6 is a maximum of 2048 bytes (our friend b (2). But this seems like an unlikely scenario as the ThunkSection would have to be placed exactly 4MB away from the call site, or at least within 2048 bytes from 4MB away to have any effect.

lld/test/ELF/arm-thumb-range-thunk-os-no-ext.s
4

Sure. The os in arm-thumb-range-thunk-os-no-ext.s stands for OutputSection. The goal of the original v7 test is to test range extension thunks in a single section. If you use different sections via a linker script you're not testing how range extension thunks behave in a single section.

Granted, this test setup is definitely overkill for this particular issue, but we're not testing range extension thunks yet for v4/5/6. And it seemed a fitting place to also test that we're doing the correct thing here as well, so I figured why not go the extra mile.

MaskRay added inline comments.Jan 5 2023, 10:26 AM
lld/test/ELF/arm-thumb-range-thunk-os-no-ext.s
2

remove excess space before -arm-add-build-attributes

use .o for relocatable files.

some older tests don't use convention, but try fixing them for new tests.

4

OK, I think the 12MiB file is fine. Change the non-RUN-non-CHECK comments to use /// , though.

stuij updated this revision to Diff 486796.Jan 6 2023, 3:19 AM
stuij marked 7 inline comments as done.

addressed review comments

stuij marked an inline comment as done.Jan 6 2023, 3:20 AM
MaskRay accepted this revision.Jan 6 2023, 12:37 PM
This revision is now accepted and ready to land.Jan 6 2023, 12:37 PM