Without using a fixup in this case, BL will be used instead of BLX to
call internal ARM functions from Thumb functions.
Details
Diff Detail
Event Timeline
One inline question, looks ok to me otherwise.
-eric
test/MC/ARM/thumb-far-jump.s | ||
---|---|---|
25 ↗ | (On Diff #99879) | I think this is right, but I'm curious how we had it precisely inverted. What's up? |
I've added a couple of inline comments and I hope I can answer the question about whether the relocations are harmless.
The ABI only permits a linker to do interworking (such as changing a BL to BLX) for a relocation if the symbol has type STT_FUNC and the relocation is one of R_ARM_CALL, R_ARM_JUMP24, R_ARM_THM_CALL, R_ARM_THM_JUMP22, R_ARM_THM_JUMP19, which match ARM BL/BLX, ARM B, Thumb BL/BLX, Thumb B.w, Thumb B.w<cc>. The narrow 16-bit Thumb branch instructions are excluded. The linker should give an error messages if it detects that an ARM/Thumb state change is required outside of these conditions. So I think that adding a relocation wouldn't make correct code generated by the assembler incorrect, and it would give the linker a chance to detect the error. I note that for ARM state callers we just mark all the branch type relocations as IsResolved=false, and I suggest we do the same for wide Thumb branch type relocations as well.
If there is a BL/BLX to an internal label that doesn't have type STT_FUNC (and may not have the thumb bit set) then the linker is not supposed to try and do interworking. According to the ABI it is the programmers responsibility to be in the correct ARM/Thumb state before writing the instruction in this case. I think the ideal behaviour would be to issue an error message, but we could rely on the linker to do it.
As an aside the intent behind the wording in the ABI was to expose as many branch relocations to the linker as possible as there was existing functionality in embedded tool chains to do things like substitute functions (for example ARM $Sub$$, and the ld symbol wrapping facility). It was thought at the time that the trade off of more information and a potentially simpler assembler was worth having more work to do at link time was worthwhile. The ABI doesn't forbid toolchains from resolving relocations at assembly time, but it does put the burden of doing correct interworking (or issuing an error when it isn't possible) if the relocation is resolved at assembly time.
lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp | ||
---|---|---|
732 | I don't think that this is complete enough as it doesn't handle the wide unconditional b.w and wide conditional b<cc> cases. The ABI [*] requires the linker to do interworking for these cases. It is rare to to see b<cc> as a function call requiring interworking but b.w is often used in tail calls. For example if you add to your new test case in thumb_caller b.w internal_arm_fn b.w global_thumb_fn b.w global_arm_fn beq.w internal_arm_fn beq.w global_arm_fn beq.w global_thumb_fn You'll get no relocations for any of these cases. There is no way the assembler can fix up the b.w and beq.w as there is immediate version of bx that can be substituted like blx. The only solution is a linker generated thunk/stub/veneer or an error message if the ARM target is mismatched.
| |
741 | In ARM state this bit of code marks all BL, BLX, B and B<cc> as IsResolved=false so all these cases are handled correctly by the linker from ARM state. I recommend that we do the same thing for Thumb wide branch instructions. |
test/MC/ARM/thumb-far-jump.s | ||
---|---|---|
25 ↗ | (On Diff #99879) | Without the changes in this patch, no relocations are generated for branches to internal labels that are in range of the branch, which was the case for the branch to end. That means BL is used for all such branches, even for branches from Thumb to ARM code. With this patch, a relocation is always generated when jumping to non Thumb code. But currently isThumbFunc only tracks function symbols, while end is a non-function label, meaning a relocation is added even though it is not really necessary because in this case we are branch from Thumb code to Thumb code. As Peter explained, additional relocations for non-function labels shouldn't cause problems, as long as there's no change in execution mode. And unless we are dealing with hand-written assembler that violates the ABI, the execution mode should only be changed by branches to function symbols AFAIK. |
test/MC/ARM/thumb-far-jump.s | ||
---|---|---|
25 ↗ | (On Diff #99879) | Seems reasonable. At any rate I trust you and Peter to figure this one out among you :) |
Updated patch to only emit relocations for branches to ELF function symbols. It seems like ARM::fixup_arm_thumb_bl isn't used for MachO binaries and I am not sure how/if we should deal with Windows binaries?
In general that looks good to me. I think it would be best not to hoist the check for STT_FUNC and STT_GNU_IFUNC into MC as this is common to all Targets and STT_GNU_IFUNC is defined in the STT_LOOS (OS specific range) and hence could mean something different on another platform/target.
lib/MC/MCSymbolELF.cpp | ||
---|---|---|
197 | I think that we should be careful in putting this in MCSymbolELF. The STT_FUNC is generic ELF and common to all Targets but STT_GNU_IFUNC is defined in the OS specific range [STT_LOOS, STT_HIOS] so there is an outside chance that it is defined to something else on a non ARM Target. |
Addressed Peter's comment and removed hoisting the check for STT_FUNC and STT_GNU_IFUNC into MCSymbolElf.
LGTM; I'm happy with that. It will be worth waiting a bit to make sure there are no other objections from those in different timezones.
I think that we should be careful in putting this in MCSymbolELF. The STT_FUNC is generic ELF and common to all Targets but STT_GNU_IFUNC is defined in the OS specific range [STT_LOOS, STT_HIOS] so there is an outside chance that it is defined to something else on a non ARM Target.