This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ARM] Add initial support for Thumb for ARMv7A
ClosedPublic

Authored by peter.smith on Jun 10 2016, 4:48 AM.

Details

Summary

Add support for the R_ARM_THM relocations used in the objects present
in arm-linux-gnueabihf-gcc. These are:

  • R_ARM_THM_CALL
  • R_ARM_THM_JUMP11
  • R_ARM_THM_JUMP19
  • R_ARM_THM_JUMP24
  • R_ARM_THM_MOVT_ABS
  • R_ARM_THM_MOVW_ABS_NC

Interworking between ARM and Thumb is partially supported with BLX. The R_ARM_CALL relocation for ARM instructions and R_ARM_THM_CALL relocation for Thumb instructions will write out a BL or BLX depending on the state of the Target.

Assumptions:

  • Target processor has BLX instruction and extended range of Thumb 4-byte Branch instructions (true for ARMv7a).
  • In relocateOne if (Val & 0x1) == 1 target is Thumb, 0 is ARM. This will hold for objects that comply with the ABI for the ARM architecture.

This is sufficient interworking support for a Thumb hello world to work with a recent arm-linux-gnueabihf distribution such as Linaro GCC 5.3-2016.02).

Limitations:

  • No interworking thunks for R_ARM_JUMP24, R_ARM_THM_JUMP24, R_ARM_THM_JUMP19 and the deprecated R_ARM_PLT32 and R_ARM_PC24 instructions as the instructions cannot be written out as a BLX and need a state change thunk.
  • No range extension thunks. The R_ARM_JUMP24 and R_ARM_THM_CALL have a range of 16Mb
  • No regression test in the test suite for R_ARM_JUMP11, llvm-mc does not emit a relocation for B.N which is sensible and permissible under the ABI as the range of the branch is too small to be usefully relocated, however gnu-as does produce the relocation and there is at least one instance in the arm-linux-gnueabihf libraries so it may be present in input objects.

References:

I will look into interworking thunks next.

Diff Detail

Event Timeline

peter.smith retitled this revision from to [LLD][ARM] Add initial support for Thumb for ARMv7A.
peter.smith updated this object.
peter.smith added reviewers: ruiu, rafael, zatrazz.
peter.smith added a subscriber: rengolin.
ruiu added inline comments.Jun 10 2016, 10:58 AM
ELF/Target.cpp
1557

Target is Thumb. Use a BLX. ?

1558

'0' -> '1'?

1564

Remove else after break.

1567

It is not 0xeb but 0xeb000000, no? If so, please add a test to cover this path.

1596

I'd just write 1 instead of 0x1 (and so is 2 and 0x2)

1678–1679

This is probably a personal preference, but I'd name Hi and Lo because they are shorter.

peter.smith marked an inline comment as done.

Updated patch to address review comments:

  • Corrected ARM BL encoding.
  • Added test cases for ARM BLX to ARM and Thumb BLX to Thumb.
  • Some name and comment changes.

Thank you for the comments. I've updated with new revision.

ELF/Target.cpp
1557

I've gone for.
// If bit 0 of Val is 1 the target is Thumb, we must select a BLX.

1558

Yes in our case the bottom bit of Val is '1'. I've made the change.

I took imm24:H:0 from the Architecture reference manual as that is how the immediate is processed by the CPU.

1564

Ok, will remove.

1567

You are correct, it should be 0xeb000000. Apologies for missing this one and test cases. The BLX from Thumb to Thumb is also missing a test case, although the encoding is ok in that case.

I will add test cases to arm-blx.s and arm-thumb-blx.s that tests that a BLX from ARM state to ARM state results in a correctly encoded BL instruction.

1596

Ok, will make that change.

1678–1679

Changed to Hi and Lo, with only 80 columns and long expressions it makes sense.

ruiu added inline comments.Jun 13 2016, 3:05 PM
ELF/Target.cpp
1561

nit: indentation

1565

I think you want to check if the most significant byte is 0xFA. So this expression needs to be

(read32le(Loc) & 0xff000000) == 0xfa000000

(or equivalently ((read32le(Loc) >> 24) == 0xfa)

no?

1600

Can we use alignTo here?

Val = alignTo(Val, 4);
1695–1696

Format

Updated for review comments:

  • Alter mask when matching BLX instruction
  • Use alignTo instead of raw expression
  • clang-format run over patch
rafael edited edge metadata.Jun 14 2016, 5:26 AM

What code sequence causes gas to produce a R_ARM_THM_JUMP11? Maybe include a .o test with a comment about how it was created?

ELF/Target.cpp
1577

So the jump 11 target has one implicit 0 at the end but jump 19 has 2 (hence checkInt<12> and checkInt<21>)

1599

Not sure this part of the command ("It is equivalent to") adds a lot of value.

1675

This is the same as SignExtern64<11> without the shift, no?

1680

And this can use SignExtend64<19>?

test/ELF/Inputs/arm-thumb-blx-targets.s
11

Can you change the name of the sections to include the relocation it is testing?

rafael added inline comments.Jun 14 2016, 5:51 AM
ELF/Target.cpp
1472

So, are all of these really R_PLT_PC and not R_PC?

If the target symbol is not preemptible we don't create a plt entry and just use the target address. But if the target symbol is preemptible we will create a plt and that will have a call in it. That will break code that is not expecting the link register to change, no?

Trying to get phab to send email.

Hello,

I've managed to get emails from phabricator, will take a look straight away.

Thanks for the comments.

Peter

I've put some answers inline. Will post an update when I've written a test for R_ARM_JUMP11. I was under the impression that binary objects were frowned upon. Will aim to have an updated patch tomorrow.

To get an R_ARM_JUMP11 the narrow encoding of B needs to be forced with B.N.
.text
.syntax unified
.thumb
B.N external_symbol

ELF/Target.cpp
1472

The normal ARM PLT entries don't corrupt the link register. The lazy loading entry saves the link register on entry, to be restored by _dl_runtime_resolve. So it is safe to branch to a PLT entry.

Arguably R_ARM_THM_JUMP11 is not R_PLT_PC as a linker is not required to generate a PLT entry for it. The range on the branch is likely to be too short to reach the PLT entry so I think that could be split out.

At present interworking that requires thunks like a Thumb branch to an ARM PLT entry is broken as the B can't be changed to a BLX. I'm hoping to address this case with basic interworking thunks for a forthcoming review.

From ELF for the ARM Architecture:
R_ARM_CALL, R_ARM_THM_CALL, R_ARM_JUMP24, R_ARM_THM_JUMP24, R_ARM_THM_JUMP19 may be subject to PLT generation

R_ARM_PC24 and R_ARM_PLT32 are deprecated but still occur in the old library I was using to test the ARM port. These relocation types pre-date the BLX instructions so they can be used on both BL and B instructions hence the mapping to R_ARM_JUMP24. They were split into two relocation types R_ARM_CALL (unconditional BL and BLX) and R_ARM_JUMP24 (conditional BL and B).

1561

Thanks for spotting, now fixed.

1565

Not quite the most significant byte, although I think that 0xfa000000 == 0xfa000000 is not quite right

The BLX encoding is from Most Significant bit:
1111 101H (where H is part of the immediate so can be 0 or 1)
The BL encoding is from Most Significant bit:
cond 1011

Where cond is not 1111 (0xf is never). ARM reused the never space to encode more instructions.

Given that R_ARM_CALL can only be used on a BL or BLX and is correctly used by the object producer it would be sufficient to use 0xf0000000 == 0xf000000.

I have a weak preference to match as many of the fixed bits in the BLX instruction as possible as it helps match what is written in the architecture manual. This should make the condition 0xfe000000 == 0xfa000000. If you prefer 0xf0000000 == 0xf0000000 which is enough to discriminate BLX from BL, I'm happy to change.

1577

Yes. It is unfortunately not always easy to tell from the number in the relocation what overflow check is needed. My understanding, and distant memory from the time, was that the number represents the size of the immediate in the instruction.

From the ARM Architecture Reference Manual

Thumb B.N for (R_ARM_THM_JUMP11)
imm32 = SignExtend(imm11:’0’, 32);

Thumb B<cond>.W for (R_ARM_JUMP19)
imm32 = SignExtend(S:J2:J1:imm6:imm11:’0’, 32);

I make S:J2:J1:imm6:imm11 20 and not 19, but S could be considered sign and not part of the 19.

1599

I can remove it if it isn't helpful.

The reason I put it in was to avoid confusion with pseudo code for BLX in the ARM Architectural reference manual:
if targetInstrSet == InstrSet_ARM then
targetAddress = Align(PC,4) + imm32;

Where Align(PC, 4) is rounding down and not up like AlignTo.

1600

Yes we can. Will use that in future wherever I can.

1675

Yes I think you're right there. Will test to make sure and change.

1680

Yes I think so, with the appropriate modifications to the shifts.

1695–1696

I've run clang-format on the file and used its indentation.

test/ELF/Inputs/arm-thumb-blx-targets.s
11

Yes, will make the change.

ruiu accepted this revision.Jun 14 2016, 9:15 AM
ruiu edited edge metadata.

Almost looking good. A few nits.

ELF/Target.cpp
1565

0xfe000000 == 0xfa000000 seems fine to me.

1569

nit: indent with two more spaces.

1599

I'd remove that part "It is equivalent to..." since the code is now obvious with alignTo().

1604

nit: two more spaces.

This revision is now accepted and ready to land.Jun 14 2016, 9:15 AM
peter.smith edited edge metadata.

Added patch to address Rafael and Rui's last comments:

  • Added test case for R_ARM_THM_JUMP11. I will add a binary version of the Inputs/arm-thumb-narrow-branch.s assembled by the GNU assembler as well as the source.
  • Formatting and comment clean ups.
  • R_ARM_THM_JUMP11 uses R_PC not R_PLT_PC as a linker does not have to generate a PLT entry for it according to ELF for the ARM architecture.
  • I have not changed the SignExtend for R_ARM_THM_JUMP11 and R_ARM_JUMP19 as the shift is important to the correctness of the expression, the shift can be moved outside the SignExtend, but I think it is a closer mapping to the architecture description to have the shift inside the SignExtend.

I'll wait till at least tomorrow morning for any more comments.

rafael accepted this revision.Jun 15 2016, 11:59 AM
rafael edited edge metadata.

LGTM too.

peter.smith closed this revision.Jun 16 2016, 3:32 AM

Thank you.

Committed revision 272881

ELF/Target.cpp
1569

Ok done. The original is where clang-format puts it, but I prefer the indented form as well.

1599

Ok, I've removed that part.

1604

Ok done. The original is where clang-format puts it, but I prefer the indented form as well.

1675

Turns out that it isn't quite the same as the shift is needed to multiply by two.
It is the same as (SignExtend64<11>(read16le(Buf) & 0x07ff) << 1) but I'm not sure it is worth making that change so I've left it as it is.

1680

As with R_ARM_JUMP11 the shift << 1 can be moved outside the SignExtend64<19> like so:
(SignExtend64<19>(((Hi & 0x0400) << 9) | // S

((Lo & 0x0800) << 7) |  // J2
((Lo & 0x2000) << 4) |  // J1
((Hi & 0x003f) << 11) | // imm6
(Lo & 0x07ff)) << 1

I've not made this change as I think the original is easier to relate to the instruction description in the reference manual.