This is an archive of the discontinued LLVM Phabricator instance.

[MC][ARM] Resolve some pcrel fixups at assembly time
ClosedPublic

Authored by MaskRay on Jan 16 2020, 4:52 PM.

Details

Summary

MC currently does not emit these relocation types, and lld does not
handle them. Add FKF_Constant as a work-around of some ARM code after
D72197. Eventually we probably should implement these relocation types.

Diff Detail

Event Timeline

MaskRay created this revision.Jan 16 2020, 4:52 PM

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: console-log.txt

pcc added a comment.Jan 16 2020, 5:09 PM

Shouldn't we just implement support for the relocation instead?

MaskRay updated this revision to Diff 238673.Jan 16 2020, 5:50 PM
MaskRay retitled this revision from [MC][ARM] Resolve fixup_{arm,t2}_adr_pcrel_12 at assembly time to [MC][ARM] Resolve some pcrel fixups at assembly time.
MaskRay edited the summary of this revision. (Show Details)

Add more fixup kinds

MaskRay updated this revision to Diff 238675.Jan 16 2020, 5:53 PM

Add a FIXME

MaskRay added a comment.EditedJan 16 2020, 5:55 PM
In D72892#1825574, @pcc wrote:

Shouldn't we just implement support for the relocation instead?

I am not familiar with ARM/Thumb. From reading the tests, it looks these instructions usually work for local symbols.

When used with global symbols, I agree that we should eventually support the relocation types in MC and lld. Not sure if that is easy, so created this patch to address your immediate problem.

ARM Big Endian is not changed.

pcc added a comment.Jan 16 2020, 6:17 PM

If we just want a quick fix then I would prefer if we consider strong hidden symbols to be non-preemptible. This just seems like it would cause our assembler to hide bugs of the sort that D72197 was intended to catch, while changing the behavior for hidden symbols would mean that we continue to catch those bugs.

Unit tests: pass. 61935 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Unit tests: fail. 61934 tests passed, 1 failed and 783 were skipped.

failed: libc++.std/thread/thread_mutex/thread_mutex_requirements/thread_mutex_requirements_mutex/thread_mutex_class/try_lock.pass.cpp

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

peter.smith added a comment.EditedJan 17 2020, 1:35 AM

As I understand it these relocation types are not often implemented is due to their short range. The Arm and Thumb2 ldr (literal) range (search for A8.8.64 in https://static.docs.arm.com/ddi0406/c/DDI0406C_C_arm_architecture_reference_manual.pdf) is:

Encoding T2 or A1 Any value in the range -4095 to 4095.

Outside of carefully crafted section orders, the chances of the relocation being in range if it went outside of a section boundary is very low. If the relocation is within a section boundary then the assembler can resolve it.

I don't think that there is anyway the instruction and the other similar small immediate relocations could be dynamically relocated to a preempted definition. There are certainly no dynamic relocations defined for them.

I can't think of a case where a compiler would be at risk of using one of these instructions for a preemptible definition. Perhaps taking the address of a function in the same section, although I'd hope that this was prevented with -fpic. Unfortunately with assembler we rely on the author to not write code that is pic and don't really have a way of diagnosing it. Emitting a relocation would permit the linker to diagnose as it would know that the symbol was being used in a preemptible context, but this might break some existing code if it were an error.

peter.smith added a comment.EditedJan 17 2020, 6:30 AM

Personally I'd prefer we either go for a permanent decision to always resolve some fixups without relocations, or just implement the relocations in LLD when they are missing, assuming they are also in ld.bfd. It is likely that ld.gold won't have implemented the relocations either but we may be less concerned about that.

Happy to submit a patch to implement missing relocations in LLD for Arm, on the initial implementation I concentrated on what llvm-mc could output. Initial test can use yaml2elf.

MaskRay updated this revision to Diff 245792.Feb 20 2020, 10:32 PM

Rebase.

In case someone needs this workaround before we implement these relocations.

tpimh added a subscriber: tpimh.Feb 21 2020, 12:04 AM
psmith accepted this revision.Feb 21 2020, 3:43 AM
psmith added a subscriber: psmith.

Rebase.

In case someone needs this workaround before we implement these relocations.

Given how close we are to 10.0 and pr44929 is on the 10.0 release blocker list. I think this is workaround is closer than emitting the relocations and updating the linkers. Will be worth waiting to see if there are any objections.

I think that at least a subset of this will be beneficial as there are some fixups that don't have relocations defined. Checking the Arm ARM https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile and https://static.docs.arm.com/ihi0044/g/aaelf32.pdf as well as the binutils implementations it looks like we have relocations defined for:

Arm state

instructionfixuprelocationgoldbfd
adr r0, bararm_adr_pcrel_12R_ARM_ALU_PC_G0_NCyesyes
ldr r0, fooarm_ldst_pcrel_12R_ARM_LDR_PC_G0yesyes
ldrd r0, r1, fooarm_pcrel_10_unscaledR_ARM_LDRS_PC_G0*no*yes

There are G1, G2 and G3 relocations as well as the add and ldr have a modified immediate form that shifts the immediate. In GNU as these can be accessed via modifiers such as :pc_g1: symbol.

Thumb state

instructionfixuprelocationgoldbfd
adr r0, barfixup_thumb_adr_pcrel_10R_ARM_THM_PC8yesyes
adr.w r0, bart2_adr_pcrel_12R_ARM_THM_ALU_PREL_11_0yesyes
ldr r0, foofixup_arm_thumb_cpR_ARM_THM_PC8yesyes
ldr.w r0, foot2_ldst_pcrel_12R_ARM_THM_PC12yesyes

I'm not aware of any relocations for vldr or the thumb state ldrd. I'm not sure I've ever seen these written in assembly though.

I'll start implementing these in LLD, will take a little time as I can't work full time on it. When that is done we can emit relocations for these cases. We'll then have to decide whether to support vldr or ldrd to a global. These may be sufficiently rare to ask for code to be changed.

This revision is now accepted and ready to land.Feb 21 2020, 3:43 AM
MaskRay added a comment.EditedFeb 23 2020, 2:45 PM

Rebase.

In case someone needs this workaround before we implement these relocations.

Given how close we are to 10.0 and pr44929 is on the 10.0 release blocker list. I think this is workaround is closer than emitting the relocations and updating the linkers. Will be worth waiting to see if there are any objections.

I think that at least a subset of this will be beneficial as there are some fixups that don't have relocations defined. Checking the Arm ARM https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile and https://static.docs.arm.com/ihi0044/g/aaelf32.pdf as well as the binutils implementations it looks like we have relocations defined for:

Arm state

instructionfixuprelocationgoldbfd
adr r0, bararm_adr_pcrel_12R_ARM_ALU_PC_G0_NCyesyes
ldr r0, fooarm_ldst_pcrel_12R_ARM_LDR_PC_G0yesyes
ldrd r0, r1, fooarm_pcrel_10_unscaledR_ARM_LDRS_PC_G0*no*yes

There are G1, G2 and G3 relocations as well as the add and ldr have a modified immediate form that shifts the immediate. In GNU as these can be accessed via modifiers such as :pc_g1: symbol.

Thumb state

instructionfixuprelocationgoldbfd
adr r0, barfixup_thumb_adr_pcrel_10R_ARM_THM_PC8yesyes
adr.w r0, bart2_adr_pcrel_12R_ARM_THM_ALU_PREL_11_0yesyes
ldr r0, foofixup_arm_thumb_cpR_ARM_THM_PC8yesyes
ldr.w r0, foot2_ldst_pcrel_12R_ARM_THM_PC12yesyes

I'm not aware of any relocations for vldr or the thumb state ldrd. I'm not sure I've ever seen these written in assembly though.

I'll start implementing these in LLD, will take a little time as I can't work full time on it. When that is done we can emit relocations for these cases. We'll then have to decide whether to support vldr or ldrd to a global. These may be sufficiently rare to ask for code to be changed.

I discussed with FreeBSD developers on #bsdmips. Hope they can make the assembly code (https://github.com/freebsd/freebsd/blob/master/sys/arm/arm/locore-v6.S#L75) more portable, without any work on LLVM's side.

For some relocation types (e.g. R_ARM_ALU_PC_G0_NC), they were only touched twice in binutils-gdb/bfd/, with the last commit made in 2006 (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4962c51a67b10abe404f59ba70d8bc51473fbc16). I can't find a ld/testsuite test demonstrating that it works. Implementing them just makes very rare code constructs like .globl foo; foo: ... adr r0, foo work.

If it turns out that more instances run into this issue. I think we can commit the workaround just to the release/10.x branch.

@psmith I can't convince myself that they are useful enough deserving your precious time implementing them :)

bsdimp added a subscriber: bsdimp.EditedFeb 24 2020, 4:38 PM

I discussed with FreeBSD developers on #bsdmips. Hope they can make the assembly code (https://github.com/freebsd/freebsd/blob/master/sys/arm/arm/locore-v6.S#L75) more portable, without any work on LLVM's side.

I wouldn't characterize the conversation like that. There was substantial pushback from the FreeBSD arm developer community to llvm/clang not accepting assembler syntax that's accept by (a) prior versions of clang and (b) binutils absent a very good reason. No such very good reason has been proffered to date.

MaskRay closed this revision.Feb 27 2020, 9:23 AM

I intended this to be committed to release/10.x only, but it is now also in master... commit 2e24219d3cbfcb8c824c58872f97de0a2e94a7c8

I discussed with FreeBSD developers on #bsdmips. Hope they can make the assembly code (https://github.com/freebsd/freebsd/blob/master/sys/arm/arm/locore-v6.S#L75) more portable, without any work on LLVM's side.

I wouldn't characterize the conversation like that. There was substantial pushback from the FreeBSD arm developer community to llvm/clang not accepting assembler syntax that's accept by (a) prior versions of clang and (b) binutils absent a very good reason. No such very good reason has been proffered to date.

@bsdimp I am of metaqwe's opinion: .globl foo; foo: adr r0, foo may be loosely considered an "undefined behavior" in assembly.

metaqwe | wow, i wonder if this qualifies as a category of undefined behavior in ARM that is only coming out now

I am not convinced that this was intentionally supported by another assembler. I am still feeling sorry if FreeBSD does not have intention to make code more portable.

I intended this to be committed to release/10.x only, but it is now also in master... commit 2e24219d3cbfcb8c824c58872f97de0a2e94a7c8

I'll rebase https://reviews.llvm.org/D75039 on top of this for master, now that the LLD part has been approved https://reviews.llvm.org/D75042 , this will emit the relocations for the Thumb case so that the linker can handle them. I've got an implementation and tests for the Arm relocations that I'd hoped to post upstream today but I've unfortunately been too distracted to get it done.

hmm, I discussed with MC developer on #bsdmips :). Hope he can make MC more conform to widely accepted standards, without issuing more and more regressions, and to understand that:

  1. The code in locore_v6.S is absolutely valid. It is used in strictly statically linked environment to fixed address, so every reference to interposed symbol handling is absurd.
  2. MC cannot, in any case, refuse to compile valid code only because same code can be invalid in other environment.
  3. Situation when new version of given program fails for valid sources, but previous version worked as expected, is called regression.
  4. Under standard open source policy, the regression is always release blocker and it cannot be resolved as won’t fix.
  5. Marking regression as won’t fix is synonym for “f**k with users”.

Just look, I fully understand that symbol interposition is “problematic” in these cases. However you cannot, in any case, expect that given assembly unit will be used in environment where symbol interposition is allowed. Or worse, you can't expect that given symbol will be interposed. All you can do is to issue a warning if there is a high probability that the problem may occur. Or issue an error if the problem has already occurred. Nothing more.

I apologize for this rant talk, but this issue makes me really angry. Imho, this is exact reason why MC have very bad reputation in arm community, why is very problematic to pass patch to upstream with "this code is not compiled by MC but is valid for all others assemblers".
And like always - we (FreeBSD) cannot compile more than 2/3 of ports with assembler files just because MC doesn't support pre-UAL syntax and upstream doesn't want to switch to UAL. This is very frustrating situation.

With luck after the following couple of patches for Arm state, we'll output relocations for these fixups when there is a global symbol, and as a bonus get to support cross-section references if they happen to be within the range of the relocations. At link time it will be possible to determine whether a symbol is interposed and possibly warn or error for problem cases in shared objects while accepting cases in executables.
https://reviews.llvm.org/D75347 (MC)
https://reviews.llvm.org/D75349 (LLD)
My apologies for any problems caused, I do want to have adr and ldr to global symbols be allowed for executables.

MaskRay added a comment.EditedFeb 28 2020, 9:48 AM

@strejda Apologies for any problems caused, but I am afraid I cannot agree with all the points.

hmm, I discussed with MC developer on #bsdmips :). Hope he can make MC more conform to widely accepted standards, without issuing more and more regressions, and to understand that:

  1. The code in locore_v6.S is absolutely valid. It is used in strictly statically linked environment to fixed address, so every reference to interposed symbol handling is absurd.

adr r0, foo When foo is STB_LOCAL, this is true. When foo is STB_GLOBAL, I have to say it is fuzzy.
If it is convenient for you, please estimate how much code uses STB_GLOBAL. It will be more persuasive than simply calling it "a regression". If a code problem is widespread, then toolchain may still have to work around them.

  1. MC cannot, in any case, refuse to compile valid code only because same code can be invalid in other environment.

There is a widely agreed rule: a direct access (non-GOT non-PLT) relocation referencing a STV_DEFAULT STB_GLOBAL symbol is not allowed. Deviating from it is considered incorrect.
In this context, GNU as does not produce these short range relocations. Instead (inconsistent with other ports of GNU as), such fixups are resolved at assembly time.
Such mistakes can be easily made: binutils code does not have a good abstraction layer for relocation types on different architectures. It is easy to omit something for some relocation types.
For example, the PowerPC arch maintainer has admitted R_PPC_ADDR16_LO is a problem, though they will probably refuse fixing the problem.
I don't have access to an ARM assembler and don't know how it behaves.

  1. Situation when new version of given program fails for valid sources, but previous version worked as expected, is called regression.

It can be an undefined behavior in metaqwe's words. clang can be loose and incorrectly allow some code constructs while a future release may disallow them.

  1. Under standard open source policy, the regression is always release blocker and it cannot be resolved as won’t fix.

Every regression should be carefully treated, but for this case it is hard to say this issue is a regression.

  1. Marking regression as won’t fix is synonym for “f**k with users”.

Just look, I fully understand that symbol interposition is “problematic” in these cases. However you cannot, in any case, expect that given assembly unit will be used in environment where symbol interposition is allowed. Or worse, you can't expect that given symbol will be interposed. All you can do is to issue a warning if there is a high probability that the problem may occur. Or issue an error if the problem has already occurred. Nothing more.

I apologize for this rant talk, but this issue makes me really angry. Imho, this is exact reason why MC have very bad reputation in arm community, why is very problematic to pass patch to upstream with "this code is not compiled by MC but is valid for all others assemblers".
And like always - we (FreeBSD) cannot compile more than 2/3 of ports with assembler files just because MC doesn't support pre-UAL syntax and upstream doesn't want to switch to UAL. This is very frustrating situation.

The other ARM assembly issues are unrelated to this particular short range relocation issue. I have the viewpoint that we should be careful accepting new features. It is not uncommon to bring legacy stuff into a project which does not play well with other things. Accepting legacy stuff into a compiler/assembler/linker may be easy to paper over some code issues, but it may not be healthy for the project in the long term.

For this case, Peter kindly offers help implementing these relocation types. They will make the overall assembler+linker model sane. Though, I am nervous about the complexity. I don't know enough of ARM and I don't know how to estimate the code which actually uses such short range relocations referencing STB_GLOBAL symbols.

@strejda Apologies for any problems caused, but I am afraid I cannot agree with all the points.

hmm, I discussed with MC developer on #bsdmips :). Hope he can make MC more conform to widely accepted standards, without issuing more and more regressions, and to understand that:

  1. The code in locore_v6.S is absolutely valid. It is used in strictly statically linked environment to fixed address, so every reference to interposed symbol handling is absurd.

adr r0, foo When foo is STB_LOCAL, this is true. When foo is STB_GLOBAL, I have to say it is fuzzy.

Nope, in almost all cases correct behavior is evident, the only remaining question is which variants (combinations) are supported and which are not.. You can't change reality by denying arguments that don't match with your theory. So, assume that foo is STB_GLOBAL for rest of this discussion.
Then, please, tell me what exactly is fuzzy if the above code is used in
a) static environment or
b) dynamic environment but given symbol is not interposed.
I have asked for this several times, always without response. And, please, be very exact and precise – this is key point (at least for me).

My answer is very clear and clear - the behavior in all cases is very well defined. You can identify incorrect cases and report errors, but you cannot broke legal use. It's always better if the compiler silently accepts the wrong code rather than rejecting a valid one. So again and again – You broke perfectly valid code.

If it is convenient for you, please estimate how much code uses STB_GLOBAL. It will be more persuasive than simply calling it "a regression". If a code problem is widespread, then toolchain may still have to work around them.

Huh, isn't that exactly what you should have done before you broke given status quo? But if the android, freebsd and linux kernel is not enough (you broke all them, right?), be sure that using “adr” on the global symbols defined in the same section is a common way to describe the various data structures in assembler and then use them in C.

  1. MC cannot, in any case, refuse to compile valid code only because same code can be invalid in other environment.

There is a widely agreed rule: a direct access (non-GOT non-PLT) relocation referencing a STV_DEFAULT STB_GLOBAL symbol is not allowed. Deviating from it is considered incorrect.
In this context, GNU as does not produce these short range relocations. Instead (inconsistent with other ports of GNU as), such fixups are resolved at assembly time.
Such mistakes can be easily made: binutils code does not have a good abstraction layer for relocation types on different architectures. It is easy to omit something for some relocation types.
For example, the PowerPC arch maintainer has admitted R_PPC_ADDR16_LO is a problem, though they will probably refuse fixing the problem.
I don't have access to an ARM assembler and don't know how it behaves.

All ARM assemblers was consistent before your change.
Give me at least some sort of evidence of this “widely agreed rule“ and mainly about it applicability to assembler code.
So you want say that every line in following example is wrong in every environment? In that case be sure that all projects (ports) are broken.

        .text
        .globl foo
foo:
        ldr     r0, =foo
        adr     r1, foo
        movw    r2,#:lower16:foo
        movt    r2,#:upper16:foo
        b       foo
        .word   foo

As you can see I wrote "valid code". Of course, you may name "valid code" as "undefined behavior", but this doesn't mean that all others will be on same wave. For me and all FreeBSD developers, the locore-v6.S code is fully valid and you haven't been able to tell why exactly that code is invalid.

  1. Situation when new version of given program fails for valid sources, but previous version worked as expected, is called regression.

It can be an undefined behavior in metaqwe's words. clang can be loose and incorrectly allow some code constructs while a future release may disallow them.

As you can see I wrote "valid sources". Of course, you may name "valid sources" as "undefined behavior", but this doesn't mean that all others will be on same wave. For me and all FreeBSD developers, the locore-v6.S code is fully valid and you haven't been able to tell why exactly that code is invalid.

  1. Under standard open source policy, the regression is always release blocker and it cannot be resolved as won’t fix.

Every regression should be carefully treated, but for this case it is hard to say this issue is a regression.

Again nope, all others says you that this is regression. The situation is very clear in this point.

  1. Marking regression as won’t fix is synonym for “f**k with users”.

Just look, I fully understand that symbol interposition is “problematic” in these cases. However you cannot, in any case, expect that given assembly unit will be used in environment where symbol interposition is allowed. Or worse, you can't expect that given symbol will be interposed. All you can do is to issue a warning if there is a high probability that the problem may occur. Or issue an error if the problem has already occurred. Nothing more.

I apologize for this rant talk, but this issue makes me really angry. Imho, this is exact reason why MC have very bad reputation in arm community, why is very problematic to pass patch to upstream with "this code is not compiled by MC but is valid for all others assemblers".
And like always - we (FreeBSD) cannot compile more than 2/3 of ports with assembler files just because MC doesn't support pre-UAL syntax and upstream doesn't want to switch to UAL. This is very frustrating situation.

The other ARM assembly issues are unrelated to this particular short range relocation issue. I have the viewpoint that we should be careful accepting new features. It is not uncommon to bring legacy stuff into a project which does not play well with other things. Accepting legacy stuff into a compiler/assembler/linker may be easy to paper over some code issues, but it may not be healthy for the project in the long term.

Hahaha, this is joke, right? Or you really thinks that rejection of 2/3 of existing code is sign of healthy project? If so, I am afraid you are the only one who has this opinion. And you don't know how much time this decision costs to others (me including)

For this case, Peter kindly offers help implementing these relocation types. They will make the overall assembler+linker model sane.

I agree with thas, so Peter@ many many many thanks !!!

Though, I am nervous about the complexity. I don't know enough of ARM and I don't know how to estimate the code which actually uses such short range relocations referencing STB_GLOBAL symbols.

But this complexity is direct result of your incorrect assumptions and patch, sorry.
Michal

@strejda Apologies for any problems caused, but I am afraid I cannot agree with all the points.

hmm, I discussed with MC developer on #bsdmips :). Hope he can make MC more conform to widely accepted standards, without issuing more and more regressions, and to understand that:

  1. The code in locore_v6.S is absolutely valid. It is used in strictly statically linked environment to fixed address, so every reference to interposed symbol handling is absurd.

adr r0, foo When foo is STB_LOCAL, this is true. When foo is STB_GLOBAL, I have to say it is fuzzy.

Nope, in almost all cases correct behavior is evident, the only remaining question is which variants (combinations) are supported and which are not.. You can't change reality by denying arguments that don't match with your theory. So, assume that foo is STB_GLOBAL for rest of this discussion.
Then, please, tell me what exactly is fuzzy if the above code is used in
a) static environment or
b) dynamic environment but given symbol is not interposed.
I have asked for this several times, always without response. And, please, be very exact and precise – this is key point (at least for me).

My answer is very clear and clear - the behavior in all cases is very well defined. You can identify incorrect cases and report errors, but you cannot broke legal use. It's always better if the compiler silently accepts the wrong code rather than rejecting a valid one. So again and again – You broke perfectly valid code.

I have argued many times it is not clear the "label" can be STB_GLOBAL for such short range fixups, but receive no response. If not, they are not "very well defined". It is not a problem that a conforming assembler does not support such STB_GLOBAL labels.

If it is convenient for you, please estimate how much code uses STB_GLOBAL. It will be more persuasive than simply calling it "a regression". If a code problem is widespread, then toolchain may still have to work around them.

Huh, isn't that exactly what you should have done before you broke given status quo? But if the android, freebsd and linux kernel is not enough (you broke all them, right?), be sure that using “adr” on the global symbols defined in the same section is a common way to describe the various data structures in assembler and then use them in C.

So far Android and ChromeOS are good. "freebsd and linux kernel is not enough" =>
I have checked this does not affect compiler generated code. There is one assembly instance in Linux. I am helping ClangBuiltLinux, so I'll help investigate and fix it anyway. They are cooperative if there are indeed issues from the kernel side. (But of course, there can be stubborn opinions in the Linux kernel community, and sometimes the toolchain has to cope.)

  1. MC cannot, in any case, refuse to compile valid code only because same code can be invalid in other environment.

There is a widely agreed rule: a direct access (non-GOT non-PLT) relocation referencing a STV_DEFAULT STB_GLOBAL symbol is not allowed. Deviating from it is considered incorrect.
In this context, GNU as does not produce these short range relocations. Instead (inconsistent with other ports of GNU as), such fixups are resolved at assembly time.
Such mistakes can be easily made: binutils code does not have a good abstraction layer for relocation types on different architectures. It is easy to omit something for some relocation types.
For example, the PowerPC arch maintainer has admitted R_PPC_ADDR16_LO is a problem, though they will probably refuse fixing the problem.
I don't have access to an ARM assembler and don't know how it behaves.

All ARM assemblers was consistent before your change.
Give me at least some sort of evidence of this “widely agreed rule“ and mainly about it applicability to assembler code.
So you want say that every line in following example is wrong in every environment? In that case be sure that all projects (ports) are broken.

        .text
        .globl foo
foo:
        ldr     r0, =foo
        adr     r1, foo
        movw    r2,#:lower16:foo
        movt    r2,#:upper16:foo
        b       foo
        .word   foo

As you can see I wrote "valid code". Of course, you may name "valid code" as "undefined behavior", but this doesn't mean that all others will be on same wave. For me and all FreeBSD developers, the locore-v6.S code is fully valid and you haven't been able to tell why exactly that code is invalid.

Please measure, instead of asserting "all projects (ports) are broken".

All ARM assemblers was consistent => it does not necessarily mean they are all correct.
https://reviews.llvm.org/D75416 => Both clang and GCC violate a clear ELF specification by not placing .toc in a section group if required => GNU ld happily accepts it.
nullptr cannot be implicitly converted to Wrap<bool> => at some point both clang and GCC were wrong => GCC fixed it => clang fixed it a few weeks ago. I can argue before GCC fixed it => all (probably not all, but close) C++ compilers were consistent, but that can't be the reason to make a change.

  1. Situation when new version of given program fails for valid sources, but previous version worked as expected, is called regression.

It can be an undefined behavior in metaqwe's words. clang can be loose and incorrectly allow some code constructs while a future release may disallow them.

As you can see I wrote "valid sources". Of course, you may name "valid sources" as "undefined behavior", but this doesn't mean that all others will be on same wave. For me and all FreeBSD developers, the locore-v6.S code is fully valid and you haven't been able to tell why exactly that code is invalid.

  1. Under standard open source policy, the regression is always release blocker and it cannot be resolved as won’t fix.

Every regression should be carefully treated, but for this case it is hard to say this issue is a regression.

Again nope, all others says you that this is regression. The situation is very clear in this point.

Your starting point is not clear => it is not clear the issue is a regression in the first place.

  1. Marking regression as won’t fix is synonym for “f**k with users”.

Just look, I fully understand that symbol interposition is “problematic” in these cases. However you cannot, in any case, expect that given assembly unit will be used in environment where symbol interposition is allowed. Or worse, you can't expect that given symbol will be interposed. All you can do is to issue a warning if there is a high probability that the problem may occur. Or issue an error if the problem has already occurred. Nothing more.

I apologize for this rant talk, but this issue makes me really angry. Imho, this is exact reason why MC have very bad reputation in arm community, why is very problematic to pass patch to upstream with "this code is not compiled by MC but is valid for all others assemblers".
And like always - we (FreeBSD) cannot compile more than 2/3 of ports with assembler files just because MC doesn't support pre-UAL syntax and upstream doesn't want to switch to UAL. This is very frustrating situation.

The other ARM assembly issues are unrelated to this particular short range relocation issue. I have the viewpoint that we should be careful accepting new features. It is not uncommon to bring legacy stuff into a project which does not play well with other things. Accepting legacy stuff into a compiler/assembler/linker may be easy to paper over some code issues, but it may not be healthy for the project in the long term.

Hahaha, this is joke, right? Or you really thinks that rejection of 2/3 of existing code is sign of healthy project? If so, I am afraid you are the only one who has this opinion. And you don't know how much time this decision costs to others (me including)

For this case, Peter kindly offers help implementing these relocation types. They will make the overall assembler+linker model sane.

I agree with thas, so Peter@ many many many thanks !!!

Though, I am nervous about the complexity. I don't know enough of ARM and I don't know how to estimate the code which actually uses such short range relocations referencing STB_GLOBAL symbols.

But this complexity is direct result of your incorrect assumptions and patch, sorry.
Michal

I had a very long answer, but just before sending it, I deleted it. We should return to beginning.

I saying that ‘ .globl foo; foo: adr r0, foo" is absolutely valid code if is used in statically compiled executable.
By words, is absolutely OK to take reference to global symbol by 'adr' pseudo instruction if given object is linked to pure static environment.

You saying that this code snippet is undefined. So imagine I'm an absolute newcomer and tell me exactly why it's wrong and what isn't exactly defined.
If you give me single real evidence I'll apologize and immediately send the patch to FreeBSD.

I hope that this is fair offer right?

And only FYI :
https://lists.gnu.org/archive/html/bug-binutils/2015-06/msg00074.html
https://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6855/1
Michal

MaskRay added a comment.EditedMar 2 2020, 12:25 PM

I had a very long answer, but just before sending it, I deleted it. We should return to beginning.

I saying that ‘ .globl foo; foo: adr r0, foo" is absolutely valid code if is used in statically compiled executable.
By words, is absolutely OK to take reference to global symbol by 'adr' pseudo instruction if given object is linked to pure static environment.

You saying that this code snippet is undefined. So imagine I'm an absolute newcomer and tell me exactly why it's wrong and what isn't exactly defined.
If you give me single real evidence I'll apologize and immediately send the patch to FreeBSD.

I hope that this is fair offer right?

And only FYI :
https://lists.gnu.org/archive/html/bug-binutils/2015-06/msg00074.html
https://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6855/1
Michal

So, in GNU as, resolving the fixup (both GNU as and LLVM MC use the term "fixup") is only done when the relocation and the definition are in the same section. -ffunction-sections can make this untrue.

adr r0, cp15_save_power

.section .text.foo,"ax"
.globl cp15_save_power
cp15_save_power:

# GNU as (ARM): Error: undefined symbol cp15_save_power used as an immediate value
# GNU as -mthumb: Error: address calculation needs a strongly defined nearby symbol
# Current llvm-mc -triple=armv7: Resolved the fixup
# Current llvm-mc -triple=thumbv7: R_ARM_THM_ALU_PREL_11_0

When the symbol is undefined:

adr r0, cp15_save_power

# GNU as (ARM): Error: undefined symbol cp15_save_power used as an immediate value
# GNU as -mthumb: Error: address calculation needs a strongly defined nearby symbol
# Current llvm-mc -triple=armv7: error: unsupported relocation on symbol
# Current llvm-mc -triple=thumbv7: R_ARM_THM_ALU_PREL_11_0

The FreeBSD ARM developers' attitude made me feel very sad:

@yyyyyy         Your arguments are not very well thought out technically. The code is in our tree, and it works with other tool chains. That shows that there is a use case. Why are you being so stubborn and bringing up things that re not relevant. It doesn't matter what other systems may have avoided this construct. The proof that it's useful is that we use it and our kernel works. What other proof is needed beyond that?
@xxxxxxxxx      and I also reject all this bullshit about "it makes everything too complex" and "it adds 200 lines of code".
@xxxxxxxxx      The code to have the assembler calculate the offset is already in place.  it has to be, to handle non-global symbols.

+@ostannard @efriedma

Given the poor support of these relocations in GNU as, do we still need to produce relocations?

Another FreeBSD developer told me that I should persist and @strejda may change their mind to fix the FreeBSD code.

I am just concerned with the complexity in D75039 and D75349. I should mention that so far Android, Chrome and ChromeOS (@nickdesaulniers @manojgupta for visibility) are good. If they find any project that uses such relocation types, I will ask them to avoid adr/ldr referencing a STB_GLOBAL symbol because the assembler portability isn't that great anyway, and it is not bad to avoid constructs involving a dark side of the linker.

If we decide to not support such short range relocation types, we can emit a friendly error message for LLVM 11, telling the user to use a local alias. My current feeling is that 0.001% code might accidentally use them.

psmith added a comment.Mar 3 2020, 2:04 AM

My personal view is that we should be able to support adr and ldr to a global symbol defined in the same section, whether at link time or at assembly time. As well as Linux, Android and BSD there is large number of embedded systems that use assembler and are entirely statically linked that we risk breaking without a good error message. I would like to avoid that situation if at all possible. I don't think we need to support adr and ldr to symbols defined outside the section as the range is too short to practically use them.

My summary of the options:
1.) Revert D72892, D75039, don't apply D75349 and we don't support adr, ldr to a global symbol, period.

  • This breaks existing code that if only run in an executable is fine. Static linking of executables encompasses almost all embedded systems code where assembly, and use of adr/ldr is most likely to occur.
  • This protects authors that run their code in a shared library and end up with a broken program due to symbol interpositioning.

2.) Keep this patch, revert D75039 and don't apply D75349, essentially evaluate adr and ldr at assembly time even to global symbols. All other expressions use relocations.

  • We keep all existing code building
  • We protect authors from symbol interpositioning apart from adr and ldr.
  • For adr and ldr to a global symbol, it is user error to use the resulting object in a shared object and interpositioning occurs.

3.) Implement D75349 and expose adr and ldr as relocations, the linker decides if the use is valid.

  • We keep all existing code building assuming linker can support the relocations.
  • Linker knows the context, shared or non-shared and can decide whether to permit adr/ldr to globals in shared contexts. My existing LLD patches don't do that, but follow on patches could.
  • Resolving the relocations, particularly the Arm ADR adds complexity to the linker.

I'd be happy with 2 or 3, my reason for advocating for 3 is that it could address most of the concerns at the cost of linker complexity, I'd prefer not to go back to 1 if at all possible.

Many thanks for constructive response. For me either from 2 or 3 is fully OK.
But may I add other variant?
Do 2 for 'adr' but 3 for rest ('ldr'...). This removes unnecessary linker complexity while holding all other useful features.
The motivation for this is:

  • Relocation for 'adr' cannot be implemented in a dynamic environment at all (imho, there is no dynamic relocation suitable for 'adr' instructions)
  • In static environment,usage should be limited to same compilation unit, same section. Extremely small relocation range doesn't allow other usage without significant chance of relocation overflow in final linking.

Emitting the relocations seems conceptually sound: the cases that make sense work, and the linker emits a reasonable error for the cases that don't. My concern is that the user is likely to get a weird error message if they use a non-lld linker. What do GNU linkers print?

Carving out an exception for ldr/adr is inconsistent at a theoretical level. I'm weakly opposed to this because exceptions make it harder for users to understand the rules. But realistically, I don't think it's likely to cause practical problems: anyone who's writing ARM assembly should be aware that ldr/adr have a very limited range.

If we decide we should in fact reject the construct, we can and should improve the diagnostics.

psmith added a comment.Mar 6 2020, 5:00 AM

Emitting the relocations seems conceptually sound: the cases that make sense work, and the linker emits a reasonable error for the cases that don't. My concern is that the user is likely to get a weird error message if they use a non-lld linker. What do GNU linkers print?

In a shared object with a global preemptible definition of X, ld.bfd will resolve the adr and ldr relocations with respect to X, and not any PLT or GOT entry for X. If X is in range this will be the same behaviour as if the assembler resolves the relocation. If X is too far away there will be an out of range error. A disadvantage of emitting a relocation is that instead of a cryptic assembler error message, we get a relocation out of range error message. This isn't ideal as the error is delayed to link time.

Apologies for the delay in responding.

Oh, if binutils is assuming the symbol isn't preemptible, there isn't much point to generating the relocation and waiting for the linker to resolve it; we should just resolve it the same way in the assembler.

Oh, if binutils is assuming the symbol isn't preemptible, there isn't much point to generating the relocation and waiting for the linker to resolve it; we should just resolve it the same way in the assembler.

My understanding is that fixing this at link time is falling out of favour. We still have some work to do here and I'd like to agree a plan on what to do next. Current state:

  • Arm state adr and ldr instructions are resolved at assembly time. Unfortunately it looks like we resolve even the cross section references which leads to incorrect code. No existing code will have this problem as it used to give an error message, but it is something we need to fix.
  • Thumb state adr and ldr instructions are resolved at link time via relocations
  • LLD Thumb state support for adr and ldr relocations have landed, depending on the assembler support for tests.
  • LLD Arm state support has not yet landed. It is more complex due to the adr using modified immediate addressing.

If we are to resolve the relocations at assembly time then we need to decide:

  • What do we do about adr and ldr to symbols outside the section? We can expose as relocations or make an error message. Currently we resolve them prematurely, most likely getting the wrong result. I have written the patches for LLD support and the binutils linkers have support so I'd prefer that.
  • What do we do about the LLD support for Thumb relocations? If we remove assembler support then we can either revert the patch adding them or I can use yaml2obj. I'd prefer the latter.
  • Is there any hope for the LLD support for Arm adr and ldr relocations? If there is none I'll abandon the patch. I'm happy to update tests to use yaml2obj if assembler support is removed.

My preference is:

  • Resolve fixup at assembly time where possible, emit relocation when not.
  • Keep LLD support for Thumb and add support for Arm, the tests have been written to only use adr and ldr that cannot be resolved at assembly time. But I would say that as I don't want to waste all of the effort put into writing it. I'm happy to go with the consensus.

The problem with early resolution in the current implementation is:

IsResolved = (FixupFlags & MCFixupKindInfo::FKF_Constant) ||
             Writer->isSymbolRefDifferenceFullyResolvedImpl(
                 *this, SA, *DF, false, true);

We should add an additional check that the relocation and destination are in the same section for FKF_Constant. A quick hack that fixes this is:

 IsResolved = ((FixupFlags & MCFixupKindInfo::FKF_Constant) &&
                Writer->isSymbolRefDifferenceFullyResolvedImpl(
                    *this, SA, *DF, false, false)) ||
               Writer->isSymbolRefDifferenceFullyResolvedImpl(
                   *this, SA, *DF, false, true);
}

This lies about being pc-relative to isSymbolRefDifferenceFullyResolvedImpl which will ultimately call MCObject::isSymbolRefDifferenceFullyResolvedImpl which will test that the fixup and destination are in the same section. There is likely a better way though.

Oh, if binutils is assuming the symbol isn't preemptible, there isn't much point to generating the relocation and waiting for the linker to resolve it; we should just resolve it the same way in the assembler.

My understanding is that fixing this at link time is falling out of favour. We still have some work to do here and I'd like to agree a plan on what to do next. Current state:

  • Arm state adr and ldr instructions are resolved at assembly time. Unfortunately it looks like we resolve even the cross section references which leads to incorrect code. No existing code will have this problem as it used to give an error message, but it is something we need to fix.
  • Thumb state adr and ldr instructions are resolved at link time via relocations
  • LLD Thumb state support for adr and ldr relocations have landed, depending on the assembler support for tests.
  • LLD Arm state support has not yet landed. It is more complex due to the adr using modified immediate addressing.

If we are to resolve the relocations at assembly time then we need to decide:

  • What do we do about adr and ldr to symbols outside the section? We can expose as relocations or make an error message. Currently we resolve them prematurely, most likely getting the wrong result. I have written the patches for LLD support and the binutils linkers have support so I'd prefer that.
  • What do we do about the LLD support for Thumb relocations? If we remove assembler support then we can either revert the patch adding them or I can use yaml2obj. I'd prefer the latter.
  • Is there any hope for the LLD support for Arm adr and ldr relocations? If there is none I'll abandon the patch. I'm happy to update tests to use yaml2obj if assembler support is removed.

My preference is:

  • Resolve fixup at assembly time where possible, emit relocation when not.

This is the same as GNU as. +1

  • Keep LLD support for Thumb and add support for Arm, the tests have been written to only use adr and ldr that cannot be resolved at assembly time. But I would say that as I don't want to waste all of the effort put into writing it. I'm happy to go with the consensus.

More precisely, keep code change in D75349 but rewrite the tests with yaml2obj plus output section address. The lld code path will be very difficult to exercise.

The problem with early resolution in the current implementation is:

IsResolved = (FixupFlags & MCFixupKindInfo::FKF_Constant) ||
             Writer->isSymbolRefDifferenceFullyResolvedImpl(
                 *this, SA, *DF, false, true);

We should add an additional check that the relocation and destination are in the same section for FKF_Constant. A quick hack that fixes this is:

 IsResolved = ((FixupFlags & MCFixupKindInfo::FKF_Constant) &&
                Writer->isSymbolRefDifferenceFullyResolvedImpl(
                    *this, SA, *DF, false, false)) ||
               Writer->isSymbolRefDifferenceFullyResolvedImpl(
                   *this, SA, *DF, false, true);
}

This lies about being pc-relative to isSymbolRefDifferenceFullyResolvedImpl which will ultimately call MCObject::isSymbolRefDifferenceFullyResolvedImpl which will test that the fixup and destination are in the same section. There is likely a better way though.

I've created a review to change the existing LLD Thumb relocations to obj2yaml with D76575