This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Don't relax R_X86_64_GOTPCRELX if addend != -4
ClosedPublic

Authored by MaskRay on Nov 23 2020, 1:05 PM.

Details

Summary

clang may produce movl x@GOTPCREL+4(%rip), %eax when loading the high 32 bits
of the address of a global variable in -fpic/-fpie mode.

If assembled by GNU as, the fixup emits an R_X86_64_GOTPCRELX with an
addend != -4. The instruction loads from the GOT entry with an offset
and thus it is incorrect to relax the instruction.

If assembled by the integrated assembler, we emit R_X86_64_GOTPCREL for
relocations that definitely cannot be relaxed (D92114), so this patch is not
needed.

This patch disables the relaxation, which is compatible with the implementation in GNU ld
("Add R_X86_64_[REX_]GOTPCRELX support to gas and ld").

Diff Detail

Event Timeline

MaskRay created this revision.Nov 23 2020, 1:05 PM
MaskRay requested review of this revision.Nov 23 2020, 1:05 PM
grimar added a comment.EditedNov 24 2020, 12:07 AM

My understanding is that R_X86_64_*RELX relocation should never be emitted when a relaxation is not possible.
ABI doesn't say anything about addends I think ("B.2 Optimize GOTPCRELX Relocations", p151, https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf).
So, why should this be handled on linker side and not on the compiler side?

My understanding is that R_X86_64_*RELX relocation should never be emitted when a relaxation is not possible.
ABI doesn't say anything about addends I think ("B.2 Optimize GOTPCRELX Relocations", p151, https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf).
So, why should this be handled on linker side and not on the compiler side?

There are two questions:

  • Whether movl x@GOTPCREL+4(%rip), %eax is valid?
  • Whether the expression should assemble to R_X86_64_GOTPCRELX?

For the first, I tend to think it is valid - this allows the compiler generates an instruction to load the half part of the GOT entry.
For the second, my understanding is that R_X86_64_GOTPCREL can be entirely replaced by R_X86_64_[REX_]GOTPCRELX. If so, it is certainly reasonable to assemble to R_X86_64_GOTPCRELX. I can send a clarification on binutils, though.

MaskRay added a comment.EditedNov 24 2020, 10:11 AM

My understanding is that R_X86_64_*RELX relocation should never be emitted when a relaxation is not possible.
ABI doesn't say anything about addends I think ("B.2 Optimize GOTPCRELX Relocations", p151, https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf).
So, why should this be handled on linker side and not on the compiler side?

There are two questions:

  • Whether movl x@GOTPCREL+4(%rip), %eax is valid?
  • Whether the expression should assemble to R_X86_64_GOTPCRELX?

For the first, I tend to think it is valid - this allows the compiler generates an instruction to load the half part of the GOT entry.
For the second, my understanding is that R_X86_64_GOTPCREL can be entirely replaced by R_X86_64_[REX_]GOTPCRELX. If so, it is certainly reasonable to assemble to R_X86_64_GOTPCRELX. I can send a clarification on binutils, though.

Asked on binutils and HJ Lu said movl x@GOTPCREL+4(%rip), %eax is valid (https://sourceware.org/pipermail/binutils/2020-November/114264.html) and R_X86_64_GOTPCRELX is correct.
So I think we should proceed with this patch. I also created a gold bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=26939

grimar added a comment.EditedNov 24 2020, 11:38 PM

My understanding is that R_X86_64_*RELX relocation should never be emitted when a relaxation is not possible.
ABI doesn't say anything about addends I think ("B.2 Optimize GOTPCRELX Relocations", p151, https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf).
So, why should this be handled on linker side and not on the compiler side?

There are two questions:

  • Whether movl x@GOTPCREL+4(%rip), %eax is valid?
  • Whether the expression should assemble to R_X86_64_GOTPCRELX?

For the first, I tend to think it is valid - this allows the compiler generates an instruction to load the half part of the GOT entry.
For the second, my understanding is that R_X86_64_GOTPCREL can be entirely replaced by R_X86_64_[REX_]GOTPCRELX. If so, it is certainly reasonable to assemble to R_X86_64_GOTPCRELX. I can send a clarification on binutils, though.

Asked on binutils and HJ Lu said movl x@GOTPCREL+4(%rip), %eax is valid (https://sourceware.org/pipermail/binutils/2020-November/114264.html) and R_X86_64_GOTPCRELX is correct.
So I think we should proceed with this patch. I also created a gold bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=26939

OK. Thanks for the information. I still don't understand why the direction is to add job for the linker instead of not emitting a potentially relaxable relocation from the begining on the compiler side.
But since it is what will be added to x86-64-ABI, I have no arguments for not doing this in LLD.

I'd add the TargetInfo::adjustGotPcExpr in a separate patch though to avoid having related PPC64 changes.

lld/ELF/Arch/PPC64.cpp
1395 ↗(On Diff #307180)

It feels that splitting the adjustRelaxExpr to adjustGotPcExpr and renaming the first one to adjustTlsExpr should be done separatelly
and then this diff will only contain X86_64 related code?

lld/test/ELF/x86-64-gotpc-offset.s
2

We have x86-64-gotpc-relax.s test case for testing relaxations. I think this test should live there?

MaskRay updated this revision to Diff 307530.Nov 25 2020, 12:29 AM
MaskRay marked an inline comment as done.
MaskRay edited the summary of this revision. (Show Details)

Extract refactoring part

lld/test/ELF/x86-64-gotpc-offset.s
2

That file is mainly about different instructions... We can use a new file for offsets, I think ....

grimar accepted this revision.Nov 25 2020, 12:47 AM

LGTM.

This revision is now accepted and ready to land.Nov 25 2020, 12:47 AM
jhenderson added a subscriber: jhenderson.EditedNov 25 2020, 6:30 AM

FWIW, I'm with @grimar in the sense that I think this is poor and unnecessary behaviour provided for by the psABI. Why emit the relocation type in this context, when the primary purpose (assuming I'm not mistaken) of this relocation type is to enable relaxations? Why not emit the generic GOTPCREL relocation?

MaskRay added a comment.EditedNov 25 2020, 8:49 AM

FWIW, I'm with @grimar in the sense that I think this is poor and unnecessary behaviour provided for by the psABI. Why emit the relocation type in this context, when the primary purpose (assuming I'm not mistaken) of this relocation type is to enable relaxations? Why not emit the generic GOTPCREL relocation?

MC could emit R_X86_64_GOTPCREL in this case. Then call GNU as' emitting R_X86_64_GOTPCRELX a bug?

Hmm, I think the assembler behavior is fine, because R_X86_64_GOTPCREL in all cases can be replaced by R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX, if the user does not mind the instructions are relaxed.

Note, R_X86_64_GOTPCRELX already does not guarantee ensured relaxation, e.g. when the symbol is preemptible the linker cannot relax the instruction, when the instruction is not one of call,jmp,adc,add,and,cmp,or,sbb,sub,test,xor, etc.

If we want to support clang -fno-integrated-as + LLD, we still need this.

I think making the check on the MC side still makes sense: D92114

Note, R_X86_64_GOTPCRELX already does not guarantee ensured relaxation, e.g. when the symbol is preemptible the linker cannot relax the instruction, when the instruction is not one of call,jmp,adc,add,and,cmp,or,sbb,sub,test,xor, etc.

Seems that, e.g., R_X86_64_GOTPCREL should give a guarantee that relaxation will not be performed though. As far I understand GNU linkers can relax R_X86_64_GOTPCREL too? It looks like a violation of ABI?
The suggested change (for ABI) about addend then seems only needed to support this behavior of GNU linkers. In LLVM we probably should just not emit a potentially relaxable relocation from the begining.

FWIW, I'm with @grimar in the sense that I think this is poor and unnecessary behaviour provided for by the psABI. Why emit the relocation type in this context, when the primary purpose (assuming I'm not mistaken) of this relocation type is to enable relaxations? Why not emit the generic GOTPCREL relocation?

MC could emit R_X86_64_GOTPCREL in this case. Then call GNU as' emitting R_X86_64_GOTPCRELX a bug?

I wouldn't call it a bug in the assembler, as my understanding from your comments earlier is that it is conformant with the ABI. I do think I'd consider it a possible bug in the psABI though, but maybe I'm misunderstanding the intent of GOTPCRELX (I thought it indicated that the target place is a candidate for relaxation, which the linker may choose to do).

Hmm, I think the assembler behavior is fine, because R_X86_64_GOTPCREL in all cases can be replaced by R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX, if the user does not mind the instructions are relaxed.

Note, R_X86_64_GOTPCRELX already does not guarantee ensured relaxation, e.g. when the symbol is preemptible the linker cannot relax the instruction, when the instruction is not one of call,jmp,adc,add,and,cmp,or,sbb,sub,test,xor, etc.

Seems that, e.g., R_X86_64_GOTPCREL should give a guarantee that relaxation will not be performed though. As far I understand GNU linkers can relax R_X86_64_GOTPCREL too? It looks like a violation of ABI?

I think our downstream platform has previously relaxed GOTPCREL too (although I'd have to go back and check). I don't think it hurts for a linker to perform an optimisation it can see is safe. I wouldn't say it's a violation of the ABI in this case, rather just something not covered by the ABI. Probably in all those cases that it did that, the relocation is now a GOTPCRELX, but I don't think it always was.

The suggested change (for ABI) about addend then seems only needed to support this behavior of GNU linkers. In LLVM we probably should just not emit a potentially relaxable relocation from the begining.

Strictly, it could be argued that we'd actually be violating the ABI in that case, though I don't disagree with it. The psABI states that in the case of mov name@GOTPCREL(%rip), %reg a GOTPCRELX relocation should be emitted. I'm note convinced that pattern specifically states the same for the +4 version you mentioned before, but HJ Lu seems to suggest it does, so if we didn't emit a GOTPCRELX, we'd be ignoring that statement.

Anyway, that's the other patch (D92114). Since we'll have to accept the GNU assembler output, it probably makes sense to do this patch. I haven't reviewed the content, but seems okay in principle. Feel free to commit once reviewers are happy.

MaskRay added a comment.EditedNov 26 2020, 9:28 AM

Note, R_X86_64_GOTPCRELX already does not guarantee ensured relaxation, e.g. when the symbol is preemptible the linker cannot relax the instruction, when the instruction is not one of call,jmp,adc,add,and,cmp,or,sbb,sub,test,xor, etc.

Seems that, e.g., R_X86_64_GOTPCREL should give a guarantee that relaxation will not be performed though. As far I understand GNU linkers can relax R_X86_64_GOTPCREL too? It looks like a violation of ABI?
The suggested change (for ABI) about addend then seems only needed to support this behavior of GNU linkers. In LLVM we probably should just not emit a potentially relaxable relocation from the begining.

GNU ld and gold can relax R_X86_64_GOTPCREL if the instruction is a movq. This behavior is not documented in the x86-64 psABI. I think LLD's choice is more sensible: no relaxation for R_X86_64_GOTPCREL.

(GNU ld's behavior is actually very complex (https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=56ceb5b5405af23eddd12e12d8ba849010120324 not changed a lot since it was committed))

I agree that it is probably impractical for us to change the x86-64 psABI now since the GNU ld behavior was set in stone in 2015 and the psABI is somewhat more about documenting the existing practice and less about what ought to be done.

Now as I see the purpose of this patch, it is entirely to make -fno-integrated-assembler work since GNU as is likely not going to change.

MaskRay updated this revision to Diff 307902.Nov 26 2020, 9:49 AM
MaskRay edited the summary of this revision. (Show Details)

Improve tests and comments

jhenderson added inline comments.Nov 27 2020, 12:24 AM
lld/test/ELF/x86-64-gotpc-offset.s
7–11

Maybe worth using FileCheck numeric variables to caclulate the address values here, so that if LLD lays things out slightly differently in the future, the test doesn't need updating. It also helps make it easier to spot the value differences. See suggested edit for an example.

(I've also added a {{^}} to the start of the first line so that it doesn't accidentally match an address pattern elsewhere in the line)

13

Here and below. The single hash makes it look like the line has some sort of functional affect or is some commented out code that was accidentally left in.

MaskRay updated this revision to Diff 308085.Nov 27 2020, 9:49 AM
MaskRay marked 2 inline comments as done.

comments

jhenderson accepted this revision.Nov 30 2020, 12:17 AM

Looks good from my point of view. Might want @grimar to give thumbs up again too.

grimar accepted this revision.Nov 30 2020, 12:31 AM

LGTM too.

This revision was automatically updated to reflect the committed changes.