This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Support R_X86_64_GOTPC{32,64}
ClosedPublic

Authored by MaskRay on May 18 2018, 5:57 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay created this revision.May 18 2018, 5:57 PM
ruiu added inline comments.May 18 2018, 6:36 PM
ELF/Arch/X86_64.cpp
328 ↗(On Diff #147627)

Don't you need to add a case for GOTPC64?

MaskRay updated this revision to Diff 147651.May 18 2018, 11:19 PM

Update tests

MaskRay retitled this revision from [ELF] Support R_X86_64_GOTPC32 and R_X86_64_GOTOFF64 to [ELF] Make R_GOTONLY_PC_FROM_END R_GOTREL_FROM_END relative to _GLOBAL_OFFSET_TABLE_ instead of end of .got, and support R_X86_64_GO│····· TPC32 and R_X86_64_GOTOFF64.May 18 2018, 11:20 PM
MaskRay edited the summary of this revision. (Show Details)
MaskRay updated this revision to Diff 147652.May 18 2018, 11:36 PM
MaskRay edited the summary of this revision. (Show Details)

Add case R_X86_64_GOTPC64: (used in -mcmodel=large)

MaskRay retitled this revision from [ELF] Make R_GOTONLY_PC_FROM_END R_GOTREL_FROM_END relative to _GLOBAL_OFFSET_TABLE_ instead of end of .got, and support R_X86_64_GO│····· TPC32 and R_X86_64_GOTOFF64 to [ELF] Make R_GOTONLY_PC_FROM_END R_GOTREL_FROM_END relative to _GLOBAL_OFFSET_TABLE_ instead of end of .got.May 18 2018, 11:40 PM
MaskRay edited the summary of this revision. (Show Details)
MaskRay marked an inline comment as done.May 19 2018, 12:07 AM

Friendly ping :)

ruiu added a comment.May 24 2018, 6:04 PM

I'm sorry I missed this one.

ELF/InputSection.cpp
482 ↗(On Diff #147652)

I wouldn't define this local variable as it doesn't add much value to the code.

I suggest separating R_X86_64_GOTOFF64 change from this patch to different one. I think it is completely independent.

Also, you are creating "relocation" folder for the new test. Please do not do it, just place the test near our others tests we have for the relocations.
(If you think it is worth to add a new folder for relocations it should be in a separate review and/or discussed first. I probably would not do that, btw).

test/ELF/relocation-i686.s
3 ↗(On Diff #147652)

Why you had to do that?

ruiu added a comment.May 25 2018, 1:37 PM

Also, you are creating "relocation" folder for the new test. Please do not do it, just place the test near our others tests we have for the relocations.
(If you think it is worth to add a new folder for relocations it should be in a separate review and/or discussed first. I probably would not do that, btw).

Thank you for noticing this. I didn't notice when I review this patch. I'll move that file to the parent directory.

Fangrui,

Please follow the local convention whenever possible, and if you have to do something new, please mention that in the patch description. That helps reviewers a lot. Thanks!

ruiu added a comment.May 25 2018, 1:38 PM

Ah, OK, this patch is not submitted yet. Please update the patch according to George's comment.

MaskRay updated this revision to Diff 148702.May 25 2018, 11:44 PM
MaskRay edited the summary of this revision. (Show Details)

Rename test file as grimar@ suggested

MaskRay updated this revision to Diff 148703.May 25 2018, 11:52 PM

Remove R_X86_64_GOTOFF64 as grimar@ suggested

MaskRay updated this revision to Diff 148704.May 25 2018, 11:57 PM

Remove remaining R_X86_64_GOTOFF64

MaskRay edited the summary of this revision. (Show Details)May 25 2018, 11:57 PM

I am not sure in this patch honestly. I'll add Peter, the author of D34355/D44259.

You wrote:

Currently in X86 and X86_64, when _GLOBAL_OFFSET_TABLE_ is used, there can be a mismatch of its value and the end of .got

That should not be important I think. As far I understand, _GLOBAL_OFFSET_TABLE_[0] can point either to .got or to .got.plt (depending on target) and used by glibc
to find link time address of _DYNAMIC (see https://bugs.llvm.org/show_bug.cgi?id=36555).
Does not seem we should assume that _GLOBAL_OFFSET_TABLE_ is the end of .got or not the end of .got.

It does not seem correct to me to resolve relocations relative to _GLOBAL_OFFSET_TABLE_ generally. There is an issue with resolving
relocations against particular _GLOBAL_OFFSET_TABLE_ symbol, but I am not sure that fix should affect anything else.

ELF/InputSection.cpp
501 ↗(On Diff #148704)

What is your plan to fix this FIXME? It does not seem nice approach to add weak _GLOBAL_OFFSET_TABLE_ symbols to test cases
which affects the calculation.

test/ELF/got-i386.s
8 ↗(On Diff #148704)

Why you dropped -NEXT?

37 ↗(On Diff #148704)

Please do minimal changes to test cases. It was no need to drop all these lines in this patch I believe.

41 ↗(On Diff #148704)

You should update these calculations and keep them instead of dropping.

test/ELF/x86-64-reloc-shared.s
17 ↗(On Diff #148704)

What I see is that llvm-mc produce following relocations here for me:

000000000003  000300000002 R_X86_64_PC32     0000000000000000 _GLOBAL_OFFSET_TABLE_ - 4
000000000009  00030000001d R_X86_64_GOTPC64  0000000000000000 _GLOBAL_OFFSET_TABLE_ + 2

So R_X86_64_GOTPC32 seems to be uncovered be this test.

grimar edited reviewers, added: psmith; removed: espindola.May 26 2018, 3:46 AM
MaskRay updated this revision to Diff 148725.May 26 2018, 10:40 AM
MaskRay marked 3 inline comments as done.

Keep lines in old tests

MaskRay added inline comments.May 26 2018, 10:45 AM
test/ELF/got-i386.s
41 ↗(On Diff #148704)

I kept // 0x13000 (bar) - 0x12000 (_GLOBAL_OFFSET_TABLE_) = 4096.

test/ELF/relocation-i686.s
3 ↗(On Diff #147652)

In case the pathname of %t2.o is too long in some environment and makes DT_NEEDED of %t2 longer and changes addresses of other fields.

test/ELF/x86-64-reloc-shared.s
17 ↗(On Diff #148704)

R_X86_64_GOTPC32 is a special case of R_X86_64_PC32 where the symbol is _GLOBAL_OFFSET_TABLE_. GNU as emits R_X86_64_GOTPC32 but llvm-mc emits R_X86_64_PC32. I have also tested GNU as

MaskRay marked an inline comment as done.May 26 2018, 11:39 AM
MaskRay added inline comments.
ELF/InputSection.cpp
501 ↗(On Diff #148704)

Changed the description. I think the usage of InX::Got->getVA() + InX::Got->getSize() is incorrect. We should use either:

  • ElfSym::GlobalOffsetTable. but it can be null when _GLOBAL_OFFSET_TABLE_ is not explicitly referenced.
  • If GlobalOffsetTable is null, use InX::Got or InX::GotPlt depending on GotBaseSymInGotPlt. We can avoid this if we scan relocations earlier and decide to emit GlobalOffsetTable.

To properly address this I also need to study the TLS stuff. This revision is already sort of complicated, I want to keep the changes in a future revision

grimar added inline comments.May 28 2018, 4:02 AM
ELF/InputSection.cpp
501 ↗(On Diff #148704)

If GlobalOffsetTable is null, use InX::Got or InX::GotPlt depending on GotBaseSymInGotPlt. We can avoid this if we scan relocations earlier and decide to emit GlobalOffsetTable.

I think GlobalOffsetTable should never be null here and scanning earlier seems to be the correct way to me. I am not sure if there will be problems or it is easy to do.
Because of that, I think it would be better to see the full patch. It should reveal the whole picture and get rid of the need to
add symbols to the existent test cases.

test/ELF/got-i386.s
41 ↗(On Diff #148704)

The original code contains 3 code lines and 3 comments, one for each:

  movl $1, bar@GOTOFF(%ecx)
  movl $2, obj@GOTOFF(%ecx)
  movl $3, obj+5@GOTOFF(%ecx)

// 0x12000 - 0 = addr(.got) = 0x12000
// 0x1200A - 10 = addr(.got) = 0x12000
// 0x1200A + 5 - 15 = addr(.got) = 0x12000

You dropped 2/3 of comments. Please don't do that.

test/ELF/i386-gotoff-shared.s
24 ↗(On Diff #148725)

0x2000 - 0x1000 != -4096

test/ELF/relocation-i686.s
3 ↗(On Diff #147652)

This sounds like an independent change for me.

test/ELF/x86-64-reloc-shared.s
17 ↗(On Diff #148704)

If llvm-mc is unable to generate inputs for the test, you can use yaml2obj to craft objects.
(For example, see ELF\invalid\invalid-relocation-x64.test)

But currently the R_X86_64_GOTPC32 is untested by your test case and that is not OK.

grimar added a comment.EditedMay 28 2018, 8:20 AM

Ok. After additional investigation (you mentioned TLS and I found a good sample for test),
I think we do the correct thing now.

I investigated pr20308 sample from gold. Below is asm code + output produced by gold as
well as relocations emitted for inputs.

.text
.p2align 4,,15

.globl set_gd
.type set_gd, @function
set_gd:
pushl %ebx #push %ebx
nop #nop
movl (%esp), %ebx #mov (%esp),%ebx
addl $_GLOBAL_OFFSET_TABLE_, %ebx #add $0x199f,%ebx, R_386_GOTPC
subl $8, %esp #sub $0x8,%esp
leal gd@tlsgd(%ebx), %eax #lea -0x10(%ebx),%eax, R_386_TLS_GD
leal gd2@tlsgd(%ebx), %eax #lea -0x8(%ebx),%eax, R_386_TLS_GD
call ___tls_get_addr@PLT #call 510 <___tls_get_addr@plt>, R_386_PLT32
nop #nop
movl 16(%esp), %edx #mov 0x10(%esp),%edx
movl %edx, (%eax) #mov %edx,(%eax)
addl $8, %esp #add $0x8,%esp
popl %ebx #pop %ebx
ret #ret
.size set_gd, .-set_gd

_GLOBAL_OFFSET_TABLE_ has value 00001ff4:

21: 00001ff4 24 OBJECT LOCAL HIDDEN 22 _GLOBAL_OFFSET_TABLE_

It is at the end of .got (equal to .got.plt):

[Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [21] .got              PROGBITS        00001fd0 000fd0 000024 00  WA  0   0  4
  [22] .got.plt          PROGBITS        00001ff4 000ff4 000020 00  WA  0   0  4

As you can see, _GLOBAL_OFFSET_TABLE_ is used for calculating indexes in
.got for TLS symbols (2 slots per symbol for TLS GD mode):

addl $_GLOBAL_OFFSET_TABLE_, %ebx #add $0x199f,%ebx, R_386_GOTPC
...
leal gd@tlsgd(%ebx), %eax #lea -0x10(%ebx),%eax, R_386_TLS_GD
leal gd2@tlsgd(%ebx), %eax #lea -0x8(%ebx),%eax, R_386_TLS_GD

Relocations for them are expected to be always relative to the end of .got, and not start of .got.plt.
Otherwise, the loader would try to use .got.plt slots instead of allocated pairs and would fail (for LLD case, since we do not have .got.plt right after .got now).

That returns me to what I said initially:

It does not seem correct to me to resolve relocations relative to _GLOBAL_OFFSET_TABLE_ generally. There is an issue with resolving
relocations against particular _GLOBAL_OFFSET_TABLE_ symbol, but I am not sure that fix should affect anything else.

One of the way I see how to solve it was suggested/described here: https://bugs.llvm.org/show_bug.cgi?id=36555#c19

Hello,

Apologies for the delay in responding, Monday was a public holiday in the UK. I'm not sure if I have much to add over what has been already said. My understanding from when I looked at pr36555:

  • Each target has a convention on where _GLOBAL_OFFSET_TABLE_ is placed.
  • glibc uses _GLOBAL_OFFSET_TABLE_[0] to find where it has been loading, it expects the link time address of _DYNAMIC. This seems to be the case for all targets.
  • glibc reserves entries in the .got.plt to support lazy loading, these seem to be accessed via the dynamic tag DT_PLTGOT rather than relative to _GLOBAL_OFFSET_TABLE_ despite the comment mentioning _GLOBAL_OFFSET_TABLE_[1] and _GLOBAL_OFFSET_TABLE_[2], I suspect that some time ago these used _GLOBAL_OFFSET_TABLE_[0] but was at some point switched to .got.plt[1] and .got.plt[2] with .got.plt[0] possibly used for _GLOBAL_OFFSET_TABLE_[0] if the target sets _GLOBAL_OFFSET_TABLE_ to .got.plt.
  • Some targets have a convention of loading the value of _GLOBAL_OFFSET_TABLE_ and accessing variables using some offset from the _GLOBAL_OFFSET_TABLE_.
  • LLD initially set _GLOBAL_OFFSET_TABLE_ to 0 and defined some of the relocation expressions from the end of the .got.
    • This works for most programs as the precise value of _GLOBAL_OFFSET_TABLE_ isn't used, as long as the relocations that are calculated relative to it come out consistently. This obviously doesn't work for glibc as it uses _GLOBAL_OFFSET_TABLE_[0] directly.

To fix this problem I think that we need two things:

  • _GLOBAL_OFFSET_TABLE_[0] needs to be able to contain the link time address of _DYNAMIC.
  • The relocations that programs use that are explicitly or implicitly relative to _GLOBAL_OFFSET_TABLE_ need to be consistent with where we place _GLOBAL_OFFSET_TABLE_.
    • Currently the _FROM_END relocations are defined to assume _GLOBAL_OFFSET_TABLE_ is set to the end of the .got section. If we've placed the _GLOBAL_OFFSET_TABLE_ somewhere else then that relocation needs redefining to where we have actually placed it.

I can think of two possible fixes that satisfy the conditions:

  • Keep the _FROM_END relocations and define _GLOBAL_OFFSET_TABLE_ to be the end of the .got section on x86, extend the .got by one word so that _GLOBAL_OFFSET_TABLE_[0] can contain the link time address of _DYNAMIC. This sounds a bit hacky to me, but I think it would work.
  • Redefine the relocations that use _FROM_END to use the location of the _GLOBAL_OFFSET_TABLE_ symbol (or where we have defined it to be) rather than hard code the end of the .got section.

Personally I'd favour the second approach as it looks like the _FROM_END relocations are a vestige of when _GLOBAL_OFFSET_TABLE_ was set to 0 and the value was never used. I'm not expert in x86 though so it would be best to get as many opinions as possible here.

ELF/InputSection.cpp
501 ↗(On Diff #148704)

I think that _GLOBAL_OFFSET_TABLE_ should not be null here in a well formed program (by convention rather than any specific wording in the ABI). It will be possible to create an example with assembly but I don't think that there would be any expectation that it would work. I think it would be better to error rather than make an arbitrary choice of value.

test/ELF/x86-64-reloc-shared.s
17 ↗(On Diff #148704)

Arm has a similar discrepancy between GNU as and llvm-mc. GNU as will use R_ARM_BASE_PREL if there is a relative relocation to _GLOBAL_OFFSET_TABLE_, this is useful as the relocation doesn't depend on the value of _GLOBAL_OFFSET_TABLE_. I submitted https://reviews.llvm.org/D46319 to see if I could get this fixed for Arm but I didn't get any response for it.

Thanks, Peter!

I can think of two possible fixes that satisfy the conditions:

  • Keep the _FROM_END relocations and define _GLOBAL_OFFSET_TABLE_ to be the end of the .got section on x86, extend the .got by one word so that _GLOBAL_OFFSET_TABLE_[0] can contain the link time address of _DYNAMIC. This sounds a bit hacky to me, but I think it would work.
  • Redefine the relocations that use _FROM_END to use the location of the _GLOBAL_OFFSET_TABLE_ symbol (or where we have defined it to be) rather than hard code the end of the .got section.

Personally I'd favour the second approach as it looks like the _FROM_END relocations are a vestige of when _GLOBAL_OFFSET_TABLE_ was set to 0 and the value was never used. I'm not expert in x86 though so it would be best to get as many opinions as possible here.

I am not sure that the second way should work (if I understand everything correctly).
For x86 as you know we define _GLOBAL_OFFSET_TABLE_ in the .got.plt[0] now.
Then for example if we have the following TLS code:

addl $_GLOBAL_OFFSET_TABLE_, %ebx #add    $0x199f,%ebx, R_386_GOTPC
subl $8, %esp                                         #sub    $0x8,%esp
leal gd@tlsgd(%ebx), %eax                    #lea    -0x10(%ebx),%eax, R_386_TLS_GD

it expects the first line to be resolved to the end of .got (to the GOT base in ABI terminology).
If we redefine the relocations that use _FROM_END to use the location of the _GLOBAL_OFFSET_TABLE_ (I think that is what this patch tried to do),
then it will not help us I think as we will end up with the address of .got.plt at the first line. And the code will fail to find a proper TLS slot as a result since .got.plt
can be far away from .got in LLD.

So we seem to have 2 problems actually:

  1. The current code does not relocate correctly and does not produce valid value when_GLOBAL_OFFSET_TABLE_ symbol is involved. (That would be fixed with second way).
  2. _GLOBAL_OFFSET_TABLE_ is expected by existent code to point to GOT base (end of .got). (Second way does not help here) And at the same time _GLOBAL_OFFSET_TABLE_ [0] should contain _DYNAMIC for glibc.

These above issues should be both solved with the first way (writing _DYNAMIC at the last .got slot and pointing _GLOBAL_OFFSET_TABLE_ to that slot).
If we do that then we probably will probably need to make _FROM_END use the location of the _GLOBAL_OFFSET_TABLE still for nice code, but it has to be in .got first of all I think.

test/ELF/x86-64-reloc-shared.s
17 ↗(On Diff #148704)

Yeah, I did not mean we need to fix llvm-mc, but at least I think we should test the particular relocation with use of yaml2obj then.

@MaskRay, do you have a sample app that does not work for you and you are trying to fix? I started to wonder what exactly the real-life use case that needs to be fixed here :)
Particularly how _GLOBAL_OFFSET_TABLE_ is used.

I am not sure that the second way should work (if I understand everything correctly).
For x86 as you know we define _GLOBAL_OFFSET_TABLE_ in the .got.plt[0] now.
Then for example if we have the following TLS code:

addl $_GLOBAL_OFFSET_TABLE_, %ebx #add    $0x199f,%ebx, R_386_GOTPC
subl $8, %esp                                         #sub    $0x8,%esp
leal gd@tlsgd(%ebx), %eax                    #lea    -0x10(%ebx),%eax, R_386_TLS_GD

it expects the first line to be resolved to the end of .got (to the GOT base in ABI terminology).
If we redefine the relocations that use _FROM_END to use the location of the _GLOBAL_OFFSET_TABLE_ (I think that is what this patch tried to do),
then it will not help us I think as we will end up with the address of .got.plt at the first line. And the code will fail to find a proper TLS slot as a result since .got.plt
can be far away from .got in LLD.

I think that the raw calculation would work if R_386_TLS_GD (RelExpr R_TLSGD) were defined to be something like InX::Got->getGlobalDynAddr(Sym) + A - ("VA of where _GLOBAL_OFFSET_TABLE_ is defined"). Instead of InX::Got->getGlobalDynOffset(Sym) + A - InX::Got->getSize();. I think that you are right that this distance will be larger if _GLOBAL_OFFSET_TABLE_ is defined to be base of .got.plt, however isn't the immediate 32-bits in size? I don't know of too many programs that would have that large a separation of .got or .got.plt but I suppose it is possible. Hope I haven't misunderstood as my x86 asm isn't good.

grimar added a comment.EditedMay 29 2018, 7:54 AM

I am not sure that the second way should work (if I understand everything correctly).
For x86 as you know we define _GLOBAL_OFFSET_TABLE_ in the .got.plt[0] now.
Then for example if we have the following TLS code:

addl $_GLOBAL_OFFSET_TABLE_, %ebx #add    $0x199f,%ebx, R_386_GOTPC
subl $8, %esp                                         #sub    $0x8,%esp
leal gd@tlsgd(%ebx), %eax                    #lea    -0x10(%ebx),%eax, R_386_TLS_GD

it expects the first line to be resolved to the end of .got (to the GOT base in ABI terminology).
If we redefine the relocations that use _FROM_END to use the location of the _GLOBAL_OFFSET_TABLE_ (I think that is what this patch tried to do),
then it will not help us I think as we will end up with the address of .got.plt at the first line. And the code will fail to find a proper TLS slot as a result since .got.plt
can be far away from .got in LLD.

I think that the raw calculation would work if R_386_TLS_GD (RelExpr R_TLSGD) were defined to be something like InX::Got->getGlobalDynAddr(Sym) + A - ("VA of where _GLOBAL_OFFSET_TABLE_ is defined"). Instead of InX::Got->getGlobalDynOffset(Sym) + A - InX::Got->getSize();.

Looks a bit tricky for me honestly.

I think that you are right that this distance will be larger if _GLOBAL_OFFSET_TABLE_ is defined to be base of .got.plt, however isn't the immediate 32-bits in size? I don't know of too many programs that would have that large a separation of .got or .got.plt >but I suppose it is possible. Hope I haven't misunderstood as my x86 asm isn't good.

I do not know.

I wonder if the next would be the correct way to solve this:

As far I understand, _GLOBAL_OFFSET_TABLE_ seems intended to be used with special types of relocations (this needs clarification).
R_X86_64_GOTPC64/R_X86_64_GOTPC32 and you mentioned R_ARM_BASE_PREL for arm, for example (your D46319 seems to make sense to me)

It seems that if we could know that symbol is _GLOBAL_OFFSET_TABLE_, we could always resolve relocation relative to the end of .got.
For x86 seems llvm-mc emits R_X86_64_PC32/R_X86_64_GOTPC64 and gas emits R_X86_64_GOTPC32/R_X86_64_GOTPC64 if _GLOBAL_OFFSET_TABLE_ is used.

If llvm-mc would emit R_X86_64_GOTPC32 too that we probably could do nothing here and continue resolving such relocations relative to the .got end.
I am not sure about the other targets, but if they also have special relocations for _GLOBAL_OFFSET_TABLE_, that could help to resolve this.

Or probably we could special case relocating _GLOBAL_OFFSET_TABLE_ symbol for now.

x86_64 ABI uses GOTPC32/GOTPC64 relocations for _GLOBAL_OFFSET_TABLE_:
(p40 https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf)

So I think llvm-mc need fix and the solution to support R_X86_64_GOTPC{32,64} is:

  1. Make llvm-mc to emit R_X86_64_GOTPC32 instead of R_X86_64_PC32 when _GLOBAL_OFFSET_TABLE_ is involved.
  2. Teach LLD about type (R_GOTONLY_PC_FROM_END) of new relocations R_X86_64_GOTPC32/R_X86_64_GOTPC64 (X86_64.cpp part of this patch). Do not change the logic in ELF/InputSection.cpp or anything else, resolve these relocations as we already do, against the end of .got.

I wonder if the next would be the correct way to solve this:

As far I understand, _GLOBAL_OFFSET_TABLE_ seems intended to be used with special types of relocations (this needs clarification).
R_X86_64_GOTPC64/R_X86_64_GOTPC32 and you mentioned R_ARM_BASE_PREL for arm, for example (your D46319 seems to make sense to me)

Given that in several processor specific ABIs the _GLOBAL_OFFSET_TABLE_ can have more than one legal location I think that it can only be used with relocations that are applied consistently with the location of _GLOBAL_OFFSET_TABLE_ chosen by the linker.

It seems that if we could know that symbol is _GLOBAL_OFFSET_TABLE_, we could always resolve relocation relative to the end of .got.
For x86 seems llvm-mc emits R_X86_64_PC32/R_X86_64_GOTPC64 and gas emits R_X86_64_GOTPC32/R_X86_64_GOTPC64 if _GLOBAL_OFFSET_TABLE_ is used.

If I understand you correctly this would mean that LLD would:

  • Resolve any relocation to _GLOBAL_OFFSET_TABLE_ as if it were defined at the end of the .got section.
  • Possibly define the _GLOBAL_OFFSET_TABLE_ symbol to be somewhere else, such as the start of the .got.plt so that _GLOBAL_OFFSET_TABLE_[0] could be used to contain the link time address of _DYNAMIC.

I think that this could work, but would require a lot of comments to explain why.

If llvm-mc would emit R_X86_64_GOTPC32 too that we probably could do nothing here and continue resolving such relocations relative to the .got end.
I am not sure about the other targets, but if they also have special relocations for _GLOBAL_OFFSET_TABLE_, that could help to resolve this.

Or probably we could special case relocating _GLOBAL_OFFSET_TABLE_ symbol for now.

I think if we change anything we should probably make it Target specific or risk breaking something. I know that when I made the mistake of moving _GLOBAL_OFFSET_TABLE_ to be the base of .got.plt on Arm I broke Chrome https://bugs.llvm.org/show_bug.cgi?id=36555#c16 . This would have worked if llvm-mc had used R_ARM_BASE_PREL rather than R_ARM_REL32 but sadly it doesn't so it needs _GLOBAL_OFFSET_TABLE_ to be the base of the .got output section.

grimar added a comment.EditedMay 29 2018, 9:50 AM

If llvm-mc would emit R_X86_64_GOTPC32 too that we probably could do nothing here and continue resolving such relocations relative to the .got end.
I am not sure about the other targets, but if they also have special relocations for _GLOBAL_OFFSET_TABLE_, that could help to resolve this.

Or probably we could special case relocating _GLOBAL_OFFSET_TABLE_ symbol for now.

I think if we change anything we should probably make it Target specific or risk breaking something. I know that when I made the mistake of moving _GLOBAL_OFFSET_TABLE_ to be the base of .got.plt on Arm I broke Chrome https://bugs.llvm.org/show_bug.cgi?id=36555#c16 . This would have worked if llvm-mc had used R_ARM_BASE_PREL rather than R_ARM_REL32 but sadly it doesn't so it needs _GLOBAL_OFFSET_TABLE_ to be the base of the .got output section.

Ok, so seems safe fix, for now, would be to make llvm-mc to emit R_X86_64_GOTPC32 for _GLOBAL_OFFSET_TABLE_ , as it does now for R_X86_64_GOTPC64.
That way we:

  1. Make llvm-mc output consistent with gas and sample from ABI.
  2. Isolate change to x86_64 target.
  3. Do not need to change anything significant in LLD, it will handle relocations to _GLOBAL_OFFSET_TABLE_ just like it already do (via R_GOTONLY_PC_FROM_END and R_GOTREL_FROM_END).

It still will not work correctly with objects built with current llvm-mc which use R_X86_64_PC32 for _GLOBAL_OFFSET_TABLE_,
but since LLD does not support R_X86_64_GOTPC32/R_X86_64_GOTPC64 at all yet, and nobody yet complained/reported bug,
it hopefully should not be a huge problem. Also, such objects currently do not seem to be fully ABI compatible anyways.

MaskRay updated this revision to Diff 148967.May 29 2018, 12:57 PM

rebase && update tests

MaskRay marked 3 inline comments as done.May 29 2018, 1:00 PM
MaskRay added inline comments.
test/ELF/relocation-i686.s
3 ↗(On Diff #147652)

Do you mean I should commit this change (-soname t2.so) in a different revision?

MaskRay added inline comments.May 29 2018, 1:06 PM
ELF/InputSection.cpp
501 ↗(On Diff #148704)

We are on the same page. I said we could "scan relocations earlier" then _GLOBAL_OFFSET_TABLE_ would not be null and that is why I left a "FIXME" there.

But doing that is probably too difficult for me. The current change is imperfect but is compatible with other places. When _GLOBAL_OFFSET_TABLE_ is null, the code makes an arbitrary choice (the same as the old code does): the end of the .got section.

MaskRay updated this revision to Diff 148978.May 29 2018, 2:00 PM

Add some comments why have to use:

return (ElfSym::GlobalOffsetTable
            ? ElfSym::GlobalOffsetTable->getVA()
            : InX::Got->getVA() + InX::Got->getSize()) +
       A - P;

and when we can get rid of it.

MaskRay added inline comments.May 29 2018, 2:06 PM
ELF/InputSection.cpp
501 ↗(On Diff #148704)

Added some comments about

return (ElfSym::GlobalOffsetTable
            ? ElfSym::GlobalOffsetTable->getVA()
            : InX::Got->getVA() + InX::Got->getSize()) +
       A - P;

It is very difficult to scan relocations earlier

ElfSym::GlobalOffsetTable is defined in

https://github.com/llvm-mirror/lld/tree/got2/ELF/Driver.cpp#L1254

addReservedSymbols();

scanRelocations is transitively called in Writer<ELFT>::finalizeSections

createSyntheticSections and a bunch of other handlers are called in between and some of them checks ElfSym::GlobalOffsetTable.

I have left a comment here why the choice albeit imperfect, does not cause trouble in practice: because the two relocations S+A-GOT and GOT+A-P counteract each other.
So an arbitrary address (end of .got) works.

MaskRay marked 4 inline comments as done.May 29 2018, 2:24 PM
MaskRay added inline comments.
test/ELF/x86-64-reloc-shared.s
17 ↗(On Diff #148704)

Thanks for the suggestion. I learned yaml2obj and added test/ELF/x86-64-reloc-shared2.test

MaskRay updated this revision to Diff 148980.May 29 2018, 2:29 PM
MaskRay marked an inline comment as done.

description of yaml2obj

grimar added inline comments.May 29 2018, 3:26 PM
test/ELF/i386-gotoff-shared.s
26 ↗(On Diff #148980)

I think there were too many different comments from me and probably them confused things, my apologies.
(It took some time to realize/refresh how it should work and how we can fix for me either.)

To summarize:
This new calculation is incorrect. Original was. The explanation is here:
https://reviews.llvm.org/D47098#1114071

The solution I am suggesting to implement R_X86_64_GOTPC32/R_X86_64_GOTPC64 is in the latest reply:
https://reviews.llvm.org/D47098#1114909
(the main point is that with that we will not need to change the code in getRelocTargetVA it seems)

test/ELF/x86-64-reloc-shared.s
6 ↗(On Diff #148980)

The same. I believe it is incorrect because of the same reasons as for i386.

If you would not add code to InputSection.cpp, it would resolve to:

0000000000001000 <R_X86_64_GOTPC64>:
    1000:	49 bb 70 20 00 00 00 	mov    $0x2070,%r11

0x1000 + 0x2070 = 0x3070 = address of .got end.
_GLOBAL_OFFSET_TABLE_ in symbol table has value of 0x2000, but it is should not be a real problem.

Why do you want to resolve it to .got.plt? Which code would use it and for what?

MaskRay added a comment.EditedMay 29 2018, 5:54 PM

Sorry I did not find the your discussions (submerged in comments :) I'm still not very clear the TLS stuff, but I'll grok these pieces and leave only the X86 changes (removing changes to InputSection.cpp) Meanwhile, created D47507

MaskRay updated this revision to Diff 149136.May 30 2018, 9:09 AM

Keep only R_X86_64_GOTPC{32,64}

MaskRay retitled this revision from [ELF] Make R_GOTONLY_PC_FROM_END R_GOTREL_FROM_END relative to _GLOBAL_OFFSET_TABLE_ instead of end of .got to [ELF] Support R_X86_64_GOTPC{32,64}.May 30 2018, 9:09 AM
MaskRay edited the summary of this revision. (Show Details)

This revision is independent from D47507 though only with both we can make the following assembly sequence work no matter the value of _GLOBAL_OFFSET_TABLE_:

// -mcmodel=medium
// extern long _DYNAMIC[] __attribute__((visibility("hidden")));
// long* foo() {
//   return _DYNAMIC;
// }

	leaq	_GLOBAL_OFFSET_TABLE_(%rip), %rdx
	movabsq	$_DYNAMIC@GOTOFF, %rax

Does this cleaned-up revision look good? (The changes to relocation calculation have been removed)

This revision is independent from D47507 though only with both we can make the following assembly sequence work no matter the value of _GLOBAL_OFFSET_TABLE_:

// -mcmodel=medium
// extern long _DYNAMIC[] __attribute__((visibility("hidden")));
// long* foo() {
//   return _DYNAMIC;
// }

	leaq	_GLOBAL_OFFSET_TABLE_(%rip), %rdx
	movabsq	$_DYNAMIC@GOTOFF, %rax

Does this cleaned-up revision look good? (The changes to relocation calculation have been removed)

That is true that this revision is technically independent, but personally, I would like to wait for D47507 landing because:

  1. We can use single asm test (and not yaml2obj) then.
  2. We can explicitly add a check for input objects to ensure them have relocations we expect. And so to verify we have no known

cases when LLD produce broken output.

test/ELF/x86-64-reloc-got.s
6 ↗(On Diff #149136)

This should have a proof. So you should dump a .got section to show that .got address + its size == 0x3070.
Please see another our tests, they have a lot of samples.

MaskRay updated this revision to Diff 150972.Jun 12 2018, 9:41 AM

yaml2obj -> .s as D47505 (use GOTPC32) has been committed

Friendly ping :)

https://reviews.llvm.org/D47507 has been commited so llvm-mc on leaq _GLOBAL_OFFSET_TABLE_(%rip), %r15 emits R_X86_64_GOTPC32 now (compatible with GNU as behavior).
The ugly yaml2obj test has been removed (thank for informing me of the existence of such utility)

MaskRay updated this revision to Diff 150973.Jun 12 2018, 9:48 AM

Proof of .got end

MaskRay marked an inline comment as done.Jun 12 2018, 9:48 AM
ruiu added a comment.Jun 12 2018, 11:31 AM

I applied this patch but the test failed. Can you fix?

ELF/Relocations.cpp
350 ↗(On Diff #150973)

clang-format

test/ELF/x86-64-reloc-got.s
11 ↗(On Diff #150973)

Let's avoid using the exact same name for both relocation type and function name. That's very confusing to humans when reading disassembly. Just fn1 and fn2 should suffice.

ruiu added inline comments.Jun 12 2018, 12:44 PM
test/ELF/x86-64-reloc-got.s
11 ↗(On Diff #150973)

Can you fix?

ruiu added inline comments.Jun 12 2018, 12:46 PM
test/ELF/x86-64-reloc-got.s
8 ↗(On Diff #150988)

Is this a stray comment? Can you remove?

MaskRay updated this revision to Diff 151008.Jun 12 2018, 1:20 PM

Update tests

MaskRay marked 3 inline comments as done.Jun 12 2018, 1:21 PM
ruiu accepted this revision.Jun 12 2018, 1:22 PM

LGTM

This revision is now accepted and ready to land.Jun 12 2018, 1:22 PM
This revision was automatically updated to reflect the committed changes.