This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Implemented optimization for R_X86_64_GOTPCREL relocation.
ClosedPublic

Authored by grimar on Dec 25 2015, 9:16 AM.

Details

Reviewers

ruiu
rafael

Commits

rG5c33b91bbe16: [ELF] - Implemented optimization for R_X86_64_GOTPCREL relocation.
rLLD270705: [ELF] - Implemented optimization for R_X86_64_GOTPCREL relocation.
rL270705: [ELF] - Implemented optimization for R_X86_64_GOTPCREL relocation.

Summary

System V Application Binary Interface AMD64 Architecture Processor Supplement Draft Version 0.99.8
(https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf, section B.2 "Optimize GOTPCRELX Relocations")
introduces possible relaxations for R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX.

This patch implements the following relaxation:
mov foo@GOTPCREL(%rip), %reg => lea foo(%rip), %reg
and also opens the door to implementing all the other ones.

The implementation was suggested by Rafael Ávila de Espíndola, with a few additions and testcases by myself.
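
As an illustration of the byte-level effect of this relaxation, here is a minimal sketch. It is not the code from the patch: the helper name and the vector-based buffer are assumptions made for the example. It relies only on the x86-64 encoding, where mov (opcode 0x8b) and lea (opcode 0x8d) take the same ModRM byte, so the rewrite changes a single byte two positions before the relocated 32-bit field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not lld's API. RelocOffset is the offset of the 32-bit
// displacement field that the GOTPCREL relocation patches. For
// `mov foo@GOTPCREL(%rip), %reg` the opcode byte 0x8b sits two bytes before
// that field (opcode, ModRM, then disp32).
inline bool relaxMovToLea(std::vector<uint8_t> &Code, size_t RelocOffset) {
  if (RelocOffset < 2 || RelocOffset + 4 > Code.size())
    return false; // no room for opcode + ModRM before the field, or no field
  if (Code[RelocOffset - 2] != 0x8b)
    return false; // not a mov load, leave the instruction alone
  Code[RelocOffset - 2] = 0x8d; // lea uses the same ModRM byte as mov
  return true;
}

// Self-check: 48 8b 05 <disp32> is `mov disp32(%rip), %rax`.
inline void relaxMovToLeaExample() {
  std::vector<uint8_t> Code = {0x48, 0x8b, 0x05, 0, 0, 0, 0};
  assert(relaxMovToLea(Code, 3));
  assert(Code[1] == 0x8d); // now `lea disp32(%rip), %rax`
}
```

After the rewrite, the displacement has to be computed against the symbol itself rather than against its GOT entry, which is why the relocation expression changes along with the opcode.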

Diff Detail

Repository: rL LLVM

Event Timeline

grimar updated this revision to Diff 43635. Dec 25 2015, 9:16 AM

grimar retitled this revision to [ELF] - Implemented optimization for R_X86_64_GOTPCREL relocation.

grimar updated this object.

grimar added reviewers: ruiu, rafael.

grimar added subscribers: llvm-commits, grimar.

Do you also intend to implement the other optimizations listed in the ABI?

ELF/Target.cpp
664 ↗	(On Diff #43635)	canBePreempted already checks this, no?
test/ELF/gotpc-relax-und-dso.s
24 ↗	(On Diff #43635)	Use {{.*}} instead of checking the bits.

Addressed review comments
Rebased

In D15779#329436, @rafael wrote:

Do you also intend to implement the other optimizations listed in the ABI?

I plan to implement all of those listed in "B.2 Optimize GOTPCRELX Relocations":
call *foo@GOTPCREL(%rip) -> nop call foo
call *foo@GOTPCREL(%rip) -> call foo nop
jmp *foo@GOTPCREL(%rip) -> jmp foo nop
mov foo@GOTPCREL(%rip), %reg -> lea foo(%rip), %reg (this patch implements it)

I have no plans for any other optimizations.

ELF/Target.cpp
664 ↗	(On Diff #43635)	Looks like so, fixed.
test/ELF/gotpc-relax-und-dso.s
24 ↗	(On Diff #43635)	I left the bits here on purpose. For patches that emit binary data, I think it is reasonable to check them, since the same instruction can be encoded in different ways. This patch changes one byte, so I would like to check the binary output as well as the generated instructions.

One thing to keep in mind is the possibility of overflow (https://sourceware.org/bugzilla/show_bug.cgi?id=18591). No need to handle it now since llvm is also broken (pr 26208). I will try to fix the LLVM, at which point we should probably check for overflows in lld.

ELF/Target.cpp
698 ↗	(On Diff #45247)	Update the comment to say that we can do mov -> lea with R_X86_64_GOTPCREL, but we need R_X86_64_GOTPCRELX for the other optimizations
702 ↗	(On Diff #45247)	Could canBePreempted take care of the ifunc check?
955 ↗	(On Diff #45247)	You are passing -1 in here because the offset is not available, correct? Is there any point in ever passing the offset if we cannot pass it here?

grimar added inline comments. Jan 19 2016, 8:34 AM

ELF/Target.cpp
698 ↗	(On Diff #45247)	Ok, I can do that. I just assumed that, since we are not going to implement any of the others before R_X86_64_GOTPCRELX, such a comment would be excessive. The current one just describes what we do; I am not sure we want to list everything we will not do :)
702 ↗	(On Diff #45247)	I would not do it. For example, we currently have the following code: if (!CanBePreempted && Body && isGnuIFunc<ELFT>(*Body)) Reloc = Target->getIRelativeReloc(); We currently implement ifunc only for the static linking case, but afaik gold/bfd support the dynamic case, for which the real CBP value makes sense. And I am not sure we should hide such a special symbol type behind the clearly named canBePreempted().
955 ↗	(On Diff #45247)	I cannot pass it here, but I still do the check for that place inside canOptimizeGotPcRel(). I just assume that a negative buffer overflow is not possible from here. My point was: since we don't have R_X86_64_GOTPCRELX yet, it is better to do all possible additional checks. After R_X86_64_GOTPCRELX is implemented we can probably remove them (just as we don't check overflows for TLS relaxations, assuming inputs are correct). Also, I am not sure whether we will still perform the relaxation for raw R_X86_64_GOTPCREL then (I guess it would be better to drop support to be consistent with the ABI, but if both gold and bfd do it then it is still safe, I think).

In D15779#330019, @rafael wrote:

One thing to keep in mind is the possibility of overflow (https://sourceware.org/bugzilla/show_bug.cgi?id=18591). No need to handle it now since llvm is also broken (pr 26208). I will try to fix the LLVM, at which point we should probably check for overflows in lld.

That is an interesting issue, thanks for the info.

Could canBePreempted take care of the ifunc check?

I would not do it.
For example, we currently have the following code:
if (!CanBePreempted && Body && isGnuIFunc<ELFT>(*Body))
      Reloc = Target->getIRelativeReloc();
We currently implement ifunc only for the static linking case, but afaik gold/bfd support the dynamic case, for which the real CBP value makes sense. And I am not sure we should hide such a special symbol type behind the clearly named canBePreempted().

Good point. Thanks.

Comment at: ELF/Target.cpp:955
@@ -906,1 +954,3 @@
 case R_X86_64_GOTPCREL:
+  if (S && canOptimizeGotPcRel(Type, *S, Loc, (uint64_t)-1))
+    optimizeGotPcRel(Loc);

rafael wrote:

You are passing -1 in here because the offset is not available, correct?
Is there any point in ever passing the offset if we cannot pass it here?

I cannot pass it here, but I still do the check for that place inside canOptimizeGotPcRel().
I just assume that a negative buffer overflow is not possible from here.
My point was: since we don't have R_X86_64_GOTPCRELX yet, it is better to do all possible additional checks. After R_X86_64_GOTPCRELX is implemented we can probably remove them (just as we don't check overflows for TLS relaxations, assuming inputs are correct).

You would still get a buffer overflow. Say Off is 1. On the first call
you pass 1 and avoid the buffer access. On the second one you pass -1
and still access position -2.

The options I see are:

1. Save the result of the call.
2. Propagate Off to RelocateOne.
3. Don't check it.

I would go with 3 for now.
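
The sentinel problem described above can be shown with a small sketch. The guard below is a hypothetical stand-in for the offset check in canOptimizeGotPcRel(), whose exact form is not shown in this thread; the only assumption is that it compares an unsigned offset against 2 before reading the opcode at Loc[-2].

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the guard discussed above: the opcode lives two
// bytes before the relocated field, so reading it is only safe when Off >= 2.
inline bool offsetAllowsOpcodeRead(uint64_t Off) { return Off >= 2; }

inline void sentinelExample() {
  // First call site: the real offset is known, and the guard rejects it.
  assert(!offsetAllowsOpcodeRead(1));
  // Second call site: the offset is unavailable, so (uint64_t)-1 is passed.
  // The sentinel is a huge unsigned value, so the guard accepts it and the
  // caller would go on to read Loc[-2] anyway.
  assert(offsetAllowsOpcodeRead(static_cast<uint64_t>(-1)));
}
```

Because the two call sites can disagree for the same relocation, the check only helps if its result is saved or the real offset is propagated, which is what options 1 and 2 amount to.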

In D15779#330316, @rafael wrote:
You would still get a buffer overflow. Say Off is 1. On the first call
you pass 1 and avoid the buffer access. On the second one you pass -1
and still access position -2.

Yes, I know. By "overflow" I meant that there would be no access violation at that place.
(I believe it is not possible to get one here, because there is a lot of other data in the buffer ahead of the relocations.)
So the worst that could happen is a read of some unrelated data during the second check. But at least one correct check (the first one) would still be performed.

The options I see are:

1. Save the result of the call.
2. Propagate Off to RelocateOne.
3. Don't check it.

I would go with 3 for now.

I agree with that. I now think that having partial checks here is no better than having none.

Lots of tests started to crash when I removed the "Off <= 2" check. I think that was the reason why I added it initially.

25> lld :: ELF/dynamic-reloc-weak.s
25> lld :: ELF/got.s
25> lld :: ELF/gotpc-relax-und-dso.s
25> lld :: ELF/local-got-shared.s
25> lld :: ELF/local-got.s
25> lld :: ELF/relocation.s
25> lld :: ELF/relro.s

For example, ELF/local-got.s, which has

_start:
	call bar@gotpcrel

So it looks like it would be better either to leave the check as is for now or to add an Off argument to RelocateOne.
I think we should check the buffer somehow to have a correct solution, so I would add the argument.

In D15779#330411, @grimar wrote:

Lots of tests started to crash when I removed the "Off <= 2" check. I think that was the reason why I added it initially.

I'll recheck the reasons for it and how to fix that tomorrow. I am probably mistaken that the Off argument is needed. I will update the patch then.

This is not a comment on this particular patch, but in general, the pieces of code for relaxation are scattered across many places: you have multiple calls to canXXX and call optimizeXXX somewhere else. That tends to be hard to understand. Each patch doesn't increase complexity that much, but as they accumulate, the code looks entangled.

Can we separate code relaxation from relocation application? I think we can create a function that visits all relocations and rewrites code, and possibly relocations, to relax the code, and call that function before applying relocations. Then when we are applying relocations, we don't need to think about code relaxation at all. Anyway, I'd like to find some way to reduce the complexity of relocation application.
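
A rough sketch of the two-pass structure proposed above, under heavy simplification. Every name here (Expr, Reloc, relaxSection, relocValue) is hypothetical and does not correspond to lld's data model; only the mov -> lea case is modelled and addends are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names here are hypothetical; this is not lld's data model.
enum class Expr { GotPc, Pc };

struct Reloc {
  size_t Offset;    // offset of the 32-bit field in the section
  Expr Kind;        // how the value to write is computed
  bool Preemptible; // whether the symbol may be overridden at run time
};

// Pass 1: visit every relocation and rewrite both the code and the
// relocation expression where a relaxation applies.
inline void relaxSection(std::vector<uint8_t> &Code, std::vector<Reloc> &Rels) {
  for (Reloc &R : Rels) {
    if (R.Kind != Expr::GotPc || R.Preemptible || R.Offset < 2)
      continue;
    if (Code[R.Offset - 2] != 0x8b)
      continue;                // only mov loads are handled in this sketch
    Code[R.Offset - 2] = 0x8d; // mov -> lea
    R.Kind = Expr::Pc;         // value becomes Sym - P, no GOT entry needed
  }
}

// Pass 2: compute the value to write with no knowledge of relaxation.
inline uint64_t relocValue(const Reloc &R, uint64_t SymVA, uint64_t GotEntryVA,
                           uint64_t P) {
  return (R.Kind == Expr::GotPc ? GotEntryVA : SymVA) - P;
}

inline void twoPassExample() {
  std::vector<uint8_t> Code = {0x48, 0x8b, 0x05, 0, 0, 0, 0};
  std::vector<Reloc> Rels = {{3, Expr::GotPc, false}};
  relaxSection(Code, Rels);
  assert(Code[1] == 0x8d && Rels[0].Kind == Expr::Pc);
}
```

The point of the split is that the second pass never inspects opcodes: by the time values are written, each relocation already carries its final expression.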

In D15779#330631, @ruiu wrote:

This is not a comment on this particular patch, but in general, the pieces of code for relaxation are scattered across many places: you have multiple calls to canXXX and call optimizeXXX somewhere else. That tends to be hard to understand. Each patch doesn't increase complexity that much, but as they accumulate, the code looks entangled.

Can we separate code relaxation from relocation application? I think we can create a function that visits all relocations and rewrites code, and possibly relocations, to relax the code, and call that function before applying relocations. Then when we are applying relocations, we don't need to think about code relaxation at all. Anyway, I'd like to find some way to reduce the complexity of relocation application.

I need to think about how best to do that. In general, it looks to me like an additional pass through all relocations that marks them or excludes them from the later pass. The idea itself looks attractive to me, but I am afraid of the code duplication that might happen in that case.
Also, do you mean you want to see that change before landing any other relaxation patch, or is it fine to make it right after the currently pending ones (there are two of them from my side in review now: http://reviews.llvm.org/D16201, http://reviews.llvm.org/D15779)?

Addressed review comments.
Removed Off argument.
Added error message for case when relocation overflow happens during relaxation.

In D15779#330019, @rafael wrote:

One thing to keep in mind is the possibility of overflow (https://sourceware.org/bugzilla/show_bug.cgi?id=18591). No need to handle it now since llvm is also broken (pr 26208). I will try to fix the LLVM, at which point we should probably check for overflows in lld.

Regarding this, I added an error message to relocateOne(), and that has 2 problems:

1. I am unsure how to detect the overflow and skip the mov->lea conversion before we are already applying relocations in relocateOne(). I would need to find the distance between the symbol and the instruction really early, so that canOptimizeGotPcRel() can be used to avoid creating GOT entries and so on.
2. I did not include a test for it, because it creates a 4 GB file, and I am not sure that is good for a test. But I used the sample from https://sourceware.org/bugzilla/show_bug.cgi?id=18591 to check that the error is displayed.

rafael added inline comments. Jan 20 2016, 6:37 AM

ELF/Target.cpp
782 ↗	(On Diff #45368)	This highlights Rui's comment that we are spreading optimizations out too much. A relocation being relative or not should really not depend on it being optimized. If it is optimized, some other relocation takes its place. Can you just check for optimization at the caller and pass the new relocation value?

Or, we should try not computing the size upfront:

It might be possible to avoid this by outputting the file with write:
* Write the allocated output sections, computing addresses.
* Apply relocations, recording which ones require a dynamic reloc.
* Write the dynamic relocations.
* Write the rest of the file.

emaste added a subscriber: emaste. Feb 11 2016, 2:20 PM

X86_64TargetInfo::canOptimizeGotPcRel() also needs a check for the "_DYNAMIC" symbol (since we almost have it, http://reviews.llvm.org/D17607), plus a testcase.
Neither gold nor bfd applies this optimization to it.

This patch was old, so I:

* Rebased it.
* Added an exception for the _DYNAMIC symbol.
* Reformatted the testcases.

Reimplemented to match updated relocations handling code.

Rebased to fit with the latest llvm and lld code.

Note: this version does not check the instruction opcode and does not relax the plain GOTPCREL relocation.
I think it is safe and correct now to handle just R_X86_64_REX_GOTPCRELX for what this patch does.
This approach simplifies the code a little and should now be fully compatible with the ABI.

rafael added inline comments. May 18 2016, 12:20 PM

ELF/Target.cpp
513 ↗	(On Diff #57580)	Why is _DYNAMIC special? The other checks are not architecture specific and should be in architecture independent code.
537 ↗	(On Diff #57580)	I would probably have this return R_GOT_PC and then have a generic logic for "can we optimize this got use". This would be somewhat similar to the fact that we optimize plt accesses to local symbols.

Reimplemented in more generic way (addressed review comments).

ELF/Target.cpp
513 ↗	(On Diff #57580)	That is something directly mentioned in the docs, I think, but initially binutils also had no exception for _DYNAMIC, and they eventually had to add one because it turns out ld.so uses it: https://sourceware.org/ml/binutils/2012-09/msg00000.html
537 ↗	(On Diff #57580)	Done, reimplemented in a more generic way.

grimar added inline comments. May 19 2016, 3:55 AM

ELF/Target.cpp
529 ↗	(On Diff #57761)	Something not mentioned, I mean.

I am reworking this. It was not correct to remove the opcode check (for some reason I was thinking that R_X86_64_REX_GOTPCRELX is generated specifically for the mov->lea relaxation),
so I plan to update this soon.

Restored the opcode check for the mov->lea conversion, which had been removed by mistake.
Updated code and testcases to produce/relax R_X86_64_GOTPCRELX.

looking

rafael added inline comments. May 23 2016, 3:47 PM

ELF/InputSection.h
67 ↗	(On Diff #58087)	It is a bit surprising to see this here. If the got access is relaxed, its result will not refer to a got entry, no?
ELF/Target.cpp
244 ↗	(On Diff #58087)	I am pretty sure this reference to "_DYNAMIC" is bogus. The email you posted refers to got[0], but we write that value directly. This code will never be used for that.
758 ↗	(On Diff #58087)	Every target would have to call this, no? Why not call it from target independent code?
762 ↗	(On Diff #58087)	It is not clear why you need this. relocateOne has a value check. Why do you need one here?

rafael added inline comments. May 23 2016, 4:36 PM

ELF/Writer.cpp
535 ↗	(On Diff #58087)	I don't think you can do an early return here. You should set Expr and let the function continue so we get the checks.

I decided to apply the patch to give it a better review.

Some of the issues I pointed out in the review (R_RELAX_GOT referring
to got entry) were the causes of other changes (unnecessary continue).

Also, the *GOTPCRELX relocations can always be relaxed if the symbol
is not ifunc or preemptable, no? It is just that it is not always a
mov->lea.

In any case, I have attached the result of simplifying your patch a
bit. Let me know what you think.

Cheers,
Rafael

Attachment: t.diff (10 KB)

In D15779#437390, @rafael wrote:

I decided to apply the patch to give it a better review.

Some of the issues I pointed out in the review (R_RELAX_GOT referring
to got entry) were the causes of other changes (unnecessary continue).

Thanks !

Also, the *GOTPCRELX relocations can always be relaxed if the symbol
is not ifunc or preemptable, no? It is just that it is not always a
mov->lea.

Sure, but this patch supports only the mov->lea conversion. So in our case
I decided to check that the instruction is exactly mov, because of the following:

In any case, I have attached the result of simplifying your patch a
bit. Let me know what you think.

I think it looks better than mine, but it has an issue.
Imagine you have the following code to relax:

.text
.globl foo
.type foo, @function
foo:
 nop

.text
.globl _start
 call	*foo@GOTPCREL(%rip)

Since your patch does not check for the mov opcode before relaxing, it "relaxes" that to lea instead
of "nop call foo" or "call foo nop". That is because relaxGot() does not yet know how to relax the other
instructions.
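
To make the distinction concrete, here is a hedged sketch of how the instruction owning a GOTPCREL field could be classified from the two bytes in front of it. The enum and function are hypothetical, not part of either patch. The byte values come from the x86-64 encoding: mov is opcode 0x8b, while `call *disp32(%rip)` and `jmp *disp32(%rip)` are opcode 0xff with ModRM 0x15 and 0x25 respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical classification of the instruction that owns a GOTPCREL field.
enum class GotInsn { MovLoad, IndirectCall, IndirectJmp, Other };

// Code points at the section bytes; Offset is the offset of the relocated
// 32-bit field. The opcode is at Offset - 2 and ModRM at Offset - 1.
inline GotInsn classifyGotInsn(const uint8_t *Code, size_t Offset) {
  if (Offset < 2)
    return GotInsn::Other; // not enough bytes in front of the field
  uint8_t Op = Code[Offset - 2];
  uint8_t ModRM = Code[Offset - 1];
  if (Op == 0x8b)
    return GotInsn::MovLoad;      // mov foo@GOTPCREL(%rip), %reg
  if (Op == 0xff && ModRM == 0x15)
    return GotInsn::IndirectCall; // call *foo@GOTPCREL(%rip)
  if (Op == 0xff && ModRM == 0x25)
    return GotInsn::IndirectJmp;  // jmp *foo@GOTPCREL(%rip)
  return GotInsn::Other;
}

inline void classifyExample() {
  const uint8_t Call[] = {0xff, 0x15, 0, 0, 0, 0};
  // A relaxer that only knows mov -> lea must leave this instruction alone.
  assert(classifyGotInsn(Call, 2) == GotInsn::IndirectCall);
}
```

With such a check in place, a relaxer that only implements mov -> lea can decline everything that is not MovLoad instead of rewriting a call into a malformed lea.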

If you don't mind, I'll use your code as a base, fix the issue above, add a testcase for it and update.
(I also plan to add the other relaxations, but I wonder whether it would be better to do that in this patch or separately.
I'll take a look at how to do that more cleanly; implementing all of them at once may or may not help to simplify
the patch.)

ELF/Target.cpp
762 ↗	(On Diff #58087)	It would write something like "relocation R_X86_64_PC32 is out of range". That does not, imo, provide enough info about what really happened. Ideally we just should not do the relaxation if we know that an overflow would happen.

Rafael, returning to this, I have a few thoughts now.

The latest x64 ABI currently describes 3 possible groups of relaxations (https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI):

1. Convert call and jump.
2. Convert mov->lea.
3. Convert test and binop: convert the memory operand of test and binop into an immediate operand, where binop is one of the adc, add, and, cmp, or, sbb, sub, xor instructions, when position-independent code is disabled.

If there were no 3, we would probably be able to leave the logic that assigns the R_RELAX_GOT_PC expression as is and just teach relaxGot() to relax all of them at once. Doing them all at once would help to avoid opcode checks anywhere except in relaxGot() itself.

But 3 says that, for a specific set of instructions, whether relaxation is possible depends on PIC.

That probably leaves us no chance for the above, and the solution I see is to introduce a Target method, canGotBeRelaxed() or something, which will check the instruction opcodes, like:

bool canGotBeRelaxed() {
  if (call or jump or move opcodes)
    return true;
  if (Test or given binop opcodes)
    return !PIC;
  return false; // I guess it is possible to have instructions that cannot be relaxed even if RELX relaxation is present?
}

and call it from adjustExpr(). What do you think?
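
For illustration, the pseudocode above could be made concrete roughly as follows. This is a sketch, not the interface that was eventually committed: the parameter list is an assumption, and the opcode values are taken from the x86-64 encoding of the register/memory forms (test is 0x85; adc, add, and, cmp, or, sbb, sub, xor are 0x13, 0x03, 0x23, 0x3b, 0x0b, 0x1b, 0x2b, 0x33), not from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical version of the predicate sketched above. Op and ModRM are the
// two bytes preceding the relocated field; IsPic says whether the output is
// position-independent.
inline bool canGotBeRelaxed(uint8_t Op, uint8_t ModRM, bool IsPic) {
  // Group 1: call *foo@GOTPCREL(%rip) and jmp *foo@GOTPCREL(%rip).
  if (Op == 0xff && (ModRM == 0x15 || ModRM == 0x25))
    return true;
  // Group 2: mov foo@GOTPCREL(%rip), %reg.
  if (Op == 0x8b)
    return true;
  // Group 3: test and the listed binops with a memory operand. These become
  // immediate-operand forms, so they are allowed only for non-PIC output.
  switch (Op) {
  case 0x85: // test
  case 0x13: // adc
  case 0x03: // add
  case 0x23: // and
  case 0x3b: // cmp
  case 0x0b: // or
  case 0x1b: // sbb
  case 0x2b: // sub
  case 0x33: // xor
    return !IsPic;
  default:
    return false; // anything else is left untouched
  }
}

inline void canGotBeRelaxedExample() {
  assert(canGotBeRelaxed(0x8b, 0x05, /*IsPic=*/true));  // mov: always
  assert(!canGotBeRelaxed(0x03, 0x05, /*IsPic=*/true)); // add: non-PIC only
}
```

Keeping the PIC decision inside one predicate means the caller only has to ask a yes/no question before switching the expression to the relaxed form.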

Yes, I missed that case 3 is non pic only :-(

Given that we don't want to complicate the interface of getRelExpr, I
think that, relative to my patch, what is needed is

Delete R_RELAXABLE_GOT_PC, getRelExpr returns just R_GOT_PC.
In Writer.cpp, the code

if (Expr == R_RELAXABLE_GOT_PC)

becomes

if (Expr == R_GOT_PC && Target->canRelaxGot(....))

Given that we will need the predicate, I would leave this patch doing
just mov -> lea for now.

Cheers,
Rafael

In D15779#438263, @rafael wrote:

Given that we will need the predicate, I would leave this patch doing
just mov -> lea for now.

Cheers,
Rafael

Was this about this patch only, or does it mean I should also suspend work on
the other relaxations for the following patches?

George.

In D15779#438263, @rafael wrote:

Given that we will need the predicate, I would leave this patch doing
just mov -> lea for now.

Cheers,
Rafael

Ah, I misread. Please ignore my previous comment.

Updated code according to Rafael's directions.
Updated testcase to check a few other instructions (that they are not converted to lea).

Rebased to top

LGTM with nits.

Thank you so much for carrying this over so many redesigns. I am very happy to see that this fits nicely in the existing infrastructure.

ELF/Relocations.cpp
345 ↗	(On Diff #58404)	Don't make uintX_t a template parameter. You can just use ELFT::uint
490 ↗	(On Diff #58404)	Use the existing variable Buf.
ELF/Target.cpp
531 ↗	(On Diff #58404)	Now this is just moving the label, please leave it in the original location.
744 ↗	(On Diff #58404)	Converting

This revision is now accepted and ready to land. May 25 2016, 7:10 AM

LGTM with nits. See the webpage for the nits, phab is not sending
emails again :-(

In D15779#439090, @rafael wrote:

LGTM with nits.

Thank you so much for carrying this over so many redesigns. I am very happy to see that this fits nicely in the existing infrastructure.

Thank you for all the reviews and the final code suggestion. Btw, today is exactly 5 months since this was first posted :)

Closed by commit rL270705: [ELF] - Implemented optimization for R_X86_64_GOTPCREL relocation. (authored by grimar). May 25 2016, 7:38 AM

This revision was automatically updated to reflect the committed changes.

grimar mentioned this in D20622: [ELF] - Added support for jmp/call relaxations when R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX are used. May 25 2016, 8:01 AM

grimar mentioned this in rL270721: [ELF] - Added support for jmp/call relaxations when…. May 25 2016, 9:57 AM

sgraenitz mentioned this in D89795: [jitlink][ELF] Add zero-fill blocks for symbols in section SHN_COMMON. Oct 24 2020, 6:04 AM

Revision Contents

Path	Size
lld/trunk/ELF/ (five files, names not shown)	4 lines, 1 line, 18 lines, 3 lines, 27 lines
lld/trunk/test/ELF/Inputs/gotpc-relax-und-dso.s	4 lines
lld/trunk/test/ELF/gotpc-relax-und-dso.s	72 lines
lld/trunk/test/ELF/gotpc-relax.s	76 lines

Diff 58416

lld/trunk/ELF/InputSection.cpp

@@ 244 lines not shown @@ if (Out<ELF64BE>::Opd) {
       uint64_t OpdEnd = OpdStart + Out<ELF64BE>::Opd->getSize();
       bool InOpd = OpdStart <= SymVA && SymVA < OpdEnd;
       if (InOpd)
         SymVA = read64be(&Out<ELF64BE>::OpdBuf[SymVA - OpdStart]);
     }
     return SymVA - P;
   }
   case R_PC:
+  case R_RELAX_GOT_PC:
     return Body.getVA<ELFT>(A) - P;
   case R_PAGE_PC:
     return getAArch64Page(Body.getVA<ELFT>(A)) - getAArch64Page(P);
   }
   llvm_unreachable("Invalid expression");
 }

 // This function applies relocations to sections without SHF_ALLOC bit.
@@ 52 lines not shown @@ for (const Relocation &Rel : Relocations) {
     uintX_t A = Rel.Addend;

     uintX_t AddrLoc = OutSec->getVA() + Offset;
     RelExpr Expr = Rel.Expr;
     uint64_t SymVA = SignExtend64<Bits>(
         getSymVA<ELFT>(Type, A, AddrLoc, Rel.Sym, BufLoc, File, Expr));

     switch (Expr) {
+    case R_RELAX_GOT_PC:
+      Target->relaxGot(BufLoc, SymVA);
+      break;
     case R_RELAX_TLS_IE_TO_LE:
       Target->relaxTlsIeToLe(BufLoc, Type, SymVA);
       break;
     case R_RELAX_TLS_LD_TO_LE:
       Target->relaxTlsLdToLe(BufLoc, Type, SymVA);
       break;
     case R_RELAX_TLS_GD_TO_LE:
       Target->relaxTlsGdToLe(BufLoc, Type, SymVA);
@@ 290 lines not shown @@

lld/trunk/ELF/Relocations.h

@@ 32 lines not shown @@ enum RelExpr {
   R_NEG_TLS,
   R_PAGE_PC,
   R_PC,
   R_PLT,
   R_PLT_PC,
   R_PPC_OPD,
   R_PPC_PLT_OPD,
   R_PPC_TOC,
+  R_RELAX_GOT_PC,
   R_RELAX_TLS_GD_TO_IE,
   R_RELAX_TLS_GD_TO_LE,
   R_RELAX_TLS_IE_TO_LE,
   R_RELAX_TLS_LD_TO_LE,
   R_SIZE,
   R_THUNK,
   R_TLS,
   R_TLSGD,
@@ 21 lines not shown @@

lld/trunk/ELF/Relocations.cpp

@@ 221 lines not shown @@

 static bool needsPlt(RelExpr Expr) {
   return Expr == R_PLT_PC || Expr == R_PPC_PLT_OPD || Expr == R_PLT;
 }

 // True if this expression is of the form Sym - X, where X is a position in the
 // file (PC, or GOT for example).
 static bool isRelExpr(RelExpr Expr) {
-  return Expr == R_PC || Expr == R_GOTREL || Expr == R_PAGE_PC;
+  return Expr == R_PC || Expr == R_GOTREL || Expr == R_PAGE_PC ||
+         Expr == R_RELAX_GOT_PC;
 }

 template <class ELFT>
 static bool isStaticLinkTimeConstant(RelExpr E, uint32_t Type,
                                      const SymbolBody &Body) {
   // These expressions always compute a constant
   if (E == R_SIZE || E == R_GOT_FROM_END || E == R_GOT_OFF ||
       E == R_MIPS_GOT_LOCAL || E == R_MIPS_GOT_LOCAL_PAGE ||
@@ 99 lines not shown @@ for (const Elf_Sym &S : SS->File->getElfSymbols(true)) {
     Alias->symbol()->IsUsedInRegularObj = true;
   }
   Out<ELFT>::RelaDyn->addReloc(
       {Target->CopyRel, Out<ELFT>::Bss, SS->OffsetInBss, false, SS, 0});
 }

 template <class ELFT>
 static RelExpr adjustExpr(const elf::ObjectFile<ELFT> &File, SymbolBody &Body,
-                          bool IsWrite, RelExpr Expr, uint32_t Type) {
+                          bool IsWrite, RelExpr Expr, uint32_t Type,
+                          const uint8_t *Data, typename ELFT::uint Offset) {
   if (Target->needsThunk(Type, File, Body))
     return R_THUNK;
   bool Preemptible = Body.isPreemptible();
-  if (Body.isGnuIFunc())
+  if (Body.isGnuIFunc()) {
     Expr = toPlt(Expr);
-  else if (needsPlt(Expr) && !Preemptible)
+  } else if (!Preemptible) {
+    if (needsPlt(Expr))
       Expr = fromPlt(Expr);
+    if (Expr == R_GOT_PC && Target->canRelaxGot(Type, Data, Offset))
+      Expr = R_RELAX_GOT_PC;
+  }

   if (IsWrite || isStaticLinkTimeConstant<ELFT>(Expr, Type, Body))
     return Expr;

   // This relocation would require the dynamic linker to write a value to read
   // only memory. We can hack around it if we are producing an executable and
   // the refered symbol can be preemepted to refer to the executable.
   if (Config->Shared || (Config->Pic && !isRelExpr(Expr))) {
@@ 113 lines not shown @@ for (auto I = Rels.begin(), E = Rels.end(); I != E; ++I) {
     if (Expr == R_HINT)
       continue;

     uintX_t Offset = C.getOffset(RI.r_offset);
     if (Offset == (uintX_t)-1)
       continue;

     bool Preemptible = Body.isPreemptible();
-    Expr = adjustExpr(File, Body, IsWrite, Expr, Type);
+    Expr = adjustExpr(File, Body, IsWrite, Expr, Type, Buf, Offset);
     if (HasError)
       continue;

     // This relocation does not require got entry, but it is relative to got and
     // needs it to be created. Here we request for that.
     if (Expr == R_GOTONLY_PC || Expr == R_GOTREL || Expr == R_PPC_TOC)
       Out<ELFT>::Got->HasGotOffRel = true;

@@ 146 lines not shown @@

lld/trunk/ELF/Target.h

@@ 82 lines not shown @@ public:
   // to support lazy loading.
   unsigned GotPltHeaderEntriesNum = 3;

   // Set to 0 for variant 2
   unsigned TcbSize = 0;

   uint32_t ThunkSize = 0;

+  virtual bool canRelaxGot(uint32_t Type, const uint8_t *Data,
+                           uint64_t Offset) const;
+  virtual void relaxGot(uint8_t *Loc, uint64_t Val) const;
   virtual void relaxTlsGdToIe(uint8_t *Loc, uint32_t Type, uint64_t Val) const;
   virtual void relaxTlsGdToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const;
   virtual void relaxTlsIeToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const;
   virtual void relaxTlsLdToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const;
 };

 uint64_t getPPC64TocBase();

 const unsigned MipsGPOffset = 0x7ff0;

 extern TargetInfo *Target;
 TargetInfo *createTarget();
 }
 }

 #endif

lld/trunk/ELF/Target.cpp

(107 lines not shown)	public:
bool isTlsInitialExecRel(uint32_t Type) const override;		bool isTlsInitialExecRel(uint32_t Type) const override;
void writeGotPltHeader(uint8_t *Buf) const override;		void writeGotPltHeader(uint8_t *Buf) const override;
void writeGotPlt(uint8_t *Buf, uint64_t Plt) const override;		void writeGotPlt(uint8_t *Buf, uint64_t Plt) const override;
void writePltZero(uint8_t *Buf) const override;		void writePltZero(uint8_t *Buf) const override;
void writePlt(uint8_t *Buf, uint64_t GotEntryAddr, uint64_t PltEntryAddr,		void writePlt(uint8_t *Buf, uint64_t GotEntryAddr, uint64_t PltEntryAddr,
int32_t Index, unsigned RelOff) const override;		int32_t Index, unsigned RelOff) const override;
void relocateOne(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;		void relocateOne(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;

		bool canRelaxGot(uint32_t Type, const uint8_t *Data,
		uint64_t Offset) const override;
		void relaxGot(uint8_t *Loc, uint64_t Val) const override;
void relaxTlsGdToIe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;		void relaxTlsGdToIe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;
void relaxTlsGdToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;		void relaxTlsGdToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;
void relaxTlsIeToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;		void relaxTlsIeToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;
void relaxTlsLdToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;		void relaxTlsLdToLe(uint8_t *Loc, uint32_t Type, uint64_t Val) const override;
};		};

class PPCTargetInfo final : public TargetInfo {		class PPCTargetInfo final : public TargetInfo {
public:		public:
(103 lines not shown)
bool TargetInfo::isTlsInitialExecRel(uint32_t Type) const { return false; }		bool TargetInfo::isTlsInitialExecRel(uint32_t Type) const { return false; }

bool TargetInfo::isTlsLocalDynamicRel(uint32_t Type) const { return false; }		bool TargetInfo::isTlsLocalDynamicRel(uint32_t Type) const { return false; }

bool TargetInfo::isTlsGlobalDynamicRel(uint32_t Type) const {		bool TargetInfo::isTlsGlobalDynamicRel(uint32_t Type) const {
return false;		return false;
}		}

		bool TargetInfo::canRelaxGot(uint32_t Type, const uint8_t *Data,
		uint64_t Offset) const {
		return false;
		}

		void TargetInfo::relaxGot(uint8_t *Loc, uint64_t Val) const {
		llvm_unreachable("Should not have claimed to be relaxable");
		}

void TargetInfo::relaxTlsGdToLe(uint8_t *Loc, uint32_t Type,		void TargetInfo::relaxTlsGdToLe(uint8_t *Loc, uint32_t Type,
uint64_t Val) const {		uint64_t Val) const {
llvm_unreachable("Should not have claimed to be relaxable");		llvm_unreachable("Should not have claimed to be relaxable");
}		}

void TargetInfo::relaxTlsGdToIe(uint8_t *Loc, uint32_t Type,		void TargetInfo::relaxTlsGdToIe(uint8_t *Loc, uint32_t Type,
uint64_t Val) const {		uint64_t Val) const {
llvm_unreachable("Should not have claimed to be relaxable");		llvm_unreachable("Should not have claimed to be relaxable");
(476 lines not shown)	void X86_64TargetInfo::relocateOne(uint8_t *Loc, uint32_t Type,
case R_X86_64_PC64:		case R_X86_64_PC64:
write64le(Loc, Val);		write64le(Loc, Val);
break;		break;
default:		default:
fatal("unrecognized reloc " + Twine(Type));		fatal("unrecognized reloc " + Twine(Type));
}		}
}		}

		bool X86_64TargetInfo::canRelaxGot(uint32_t Type, const uint8_t *Data,
		uint64_t Offset) const {
		if (Type != R_X86_64_GOTPCRELX && Type != R_X86_64_REX_GOTPCRELX)
		return false;

		// Converting mov foo@GOTPCREL(%rip), %reg to lea foo(%rip), %reg
		// is the only supported relaxation for now.
		return (Offset >= 2 && Data[Offset - 2] == 0x8b);
		}

		void X86_64TargetInfo::relaxGot(uint8_t *Loc, uint64_t Val) const {
		Loc[-2] = 0x8d;
		relocateOne(Loc, R_X86_64_PC32, Val);
		}

// Relocation masks following the #lo(value), #hi(value), #ha(value),		// Relocation masks following the #lo(value), #hi(value), #ha(value),
// #higher(value), #highera(value), #highest(value), and #highesta(value)		// #higher(value), #highera(value), #highest(value), and #highesta(value)
// macros defined in section 4.5.1. Relocation Types of the PPC-elf64abi		// macros defined in section 4.5.1. Relocation Types of the PPC-elf64abi
// document.		// document.
static uint16_t applyPPCLo(uint64_t V) { return V; }		static uint16_t applyPPCLo(uint64_t V) { return V; }
static uint16_t applyPPCHi(uint64_t V) { return V >> 16; }		static uint16_t applyPPCHi(uint64_t V) { return V >> 16; }
static uint16_t applyPPCHa(uint64_t V) { return (V + 0x8000) >> 16; }		static uint16_t applyPPCHa(uint64_t V) { return (V + 0x8000) >> 16; }
static uint16_t applyPPCHigher(uint64_t V) { return V >> 32; }		static uint16_t applyPPCHigher(uint64_t V) { return V >> 32; }
(800 lines not shown)
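
The `canRelaxGot`/`relaxGot` pair above hinges on one byte: the opcode two bytes before the 32-bit displacement. Below is a minimal standalone sketch of that check and rewrite, assuming a plain byte buffer instead of lld's section data. `canRelaxMovToLea` and `relaxMovToLea` are hypothetical names, not part of lld, and the manual little-endian write only stands in for `relocateOne(Loc, R_X86_64_PC32, Val)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for canRelaxGot (illustration only, not lld's API).
// Offset is the position of the 4-byte displacement inside Data. The
// encoding is [REX] opcode ModRM disp32, so the opcode sits at Offset - 2.
bool canRelaxMovToLea(const uint8_t *Data, size_t Offset) {
  return Offset >= 2 && Data[Offset - 2] == 0x8b; // 0x8b: MOV r, r/m
}

// Hypothetical stand-in for relaxGot. Loc points at the displacement field;
// Disp is the already computed PC-relative value for the symbol itself.
void relaxMovToLea(uint8_t *Loc, int32_t Disp) {
  assert(Loc[-2] == 0x8b && "caller must check canRelaxMovToLea first");
  Loc[-2] = 0x8d; // 0x8d: LEA r, m (REX prefix and ModRM stay unchanged)
  // Store the displacement as 32-bit little-endian.
  uint32_t U = static_cast<uint32_t>(Disp);
  for (int I = 0; I < 4; ++I)
    Loc[I] = static_cast<uint8_t>(U >> (8 * I));
}
```

For the bytes `48 8b 05 ...` with the displacement at offset 3, the check passes, and a displacement of -10 rewrites them to `48 8d 05 f6 ff ff ff`, which matches the first `leaq` line expected in gotpc-relax.s. An indirect call encoded as `ff 15 ...` fails the check because the byte at `Offset - 2` is `0xff`.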

lld/trunk/test/ELF/Inputs/gotpc-relax-und-dso.s

				.globl dsofoo
				.type dsofoo, @function
				dsofoo:
				nop

lld/trunk/test/ELF/gotpc-relax-und-dso.s

				# REQUIRES: x86
				# RUN: llvm-mc -filetype=obj -relax-relocations -triple=x86_64-unknown-linux %s -o %t.o
				# RUN: llvm-mc -filetype=obj -relax-relocations -triple=x86_64-pc-linux %S/Inputs/gotpc-relax-und-dso.s -o %tdso.o
				# RUN: ld.lld -shared %tdso.o -o %t.so
				# RUN: ld.lld -shared %t.o %t.so -o %tout
				# RUN: llvm-readobj -r -s %tout | FileCheck --check-prefix=RELOC %s
				# RUN: llvm-objdump -d %tout | FileCheck --check-prefix=DISASM %s

				# RELOC: Relocations [
				# RELOC-NEXT: Section ({{.*}}) .rela.dyn {
				# RELOC-NEXT: 0x20A8 R_X86_64_GLOB_DAT dsofoo 0x0
				# RELOC-NEXT: 0x20B0 R_X86_64_GLOB_DAT foo 0x0
				# RELOC-NEXT: 0x20A0 R_X86_64_GLOB_DAT und 0x0
				# RELOC-NEXT: }
				# RELOC-NEXT: ]

				# 0x101e + 7 - 36 = 0x1001
				# 0x1025 + 7 - 43 = 0x1001
				# DISASM: Disassembly of section .text:
				# DISASM-NEXT: foo:
				# DISASM-NEXT: 1000: 90 nop
				# DISASM: hid:
				# DISASM-NEXT: 1001: 90 nop
				# DISASM: _start:
				# DISASM-NEXT: 1002: 48 8b 05 97 10 00 00 movq 4247(%rip), %rax
				# DISASM-NEXT: 1009: 48 8b 05 90 10 00 00 movq 4240(%rip), %rax
				# DISASM-NEXT: 1010: 48 8b 05 91 10 00 00 movq 4241(%rip), %rax
				# DISASM-NEXT: 1017: 48 8b 05 8a 10 00 00 movq 4234(%rip), %rax
				# DISASM-NEXT: 101e: 48 8d 05 dc ff ff ff leaq -36(%rip), %rax
				# DISASM-NEXT: 1025: 48 8d 05 d5 ff ff ff leaq -43(%rip), %rax
				# DISASM-NEXT: 102c: 48 8b 05 7d 10 00 00 movq 4221(%rip), %rax
				# DISASM-NEXT: 1033: 48 8b 05 76 10 00 00 movq 4214(%rip), %rax
				# DISASM-NEXT: 103a: 8b 05 60 10 00 00 movl 4192(%rip), %eax
				# DISASM-NEXT: 1040: 8b 05 5a 10 00 00 movl 4186(%rip), %eax
				# DISASM-NEXT: 1046: 8b 05 5c 10 00 00 movl 4188(%rip), %eax
				# DISASM-NEXT: 104c: 8b 05 56 10 00 00 movl 4182(%rip), %eax
				# DISASM-NEXT: 1052: 8d 05 a9 ff ff ff leal -87(%rip), %eax
				# DISASM-NEXT: 1058: 8d 05 a3 ff ff ff leal -93(%rip), %eax
				# DISASM-NEXT: 105e: 8b 05 4c 10 00 00 movl 4172(%rip), %eax
				# DISASM-NEXT: 1064: 8b 05 46 10 00 00 movl 4166(%rip), %eax

				.text
				.globl foo
				.type foo, @function
				foo:
				nop

				.globl hid
				.hidden hid
				.type hid, @function
				hid:
				nop

				.globl _start
				.type _start, @function
				_start:
				movq und@GOTPCREL(%rip), %rax
				movq und@GOTPCREL(%rip), %rax
				movq dsofoo@GOTPCREL(%rip), %rax
				movq dsofoo@GOTPCREL(%rip), %rax
				movq hid@GOTPCREL(%rip), %rax
				movq hid@GOTPCREL(%rip), %rax
				movq foo@GOTPCREL(%rip), %rax
				movq foo@GOTPCREL(%rip), %rax
				movl und@GOTPCREL(%rip), %eax
				movl und@GOTPCREL(%rip), %eax
				movl dsofoo@GOTPCREL(%rip), %eax
				movl dsofoo@GOTPCREL(%rip), %eax
				movl hid@GOTPCREL(%rip), %eax
				movl hid@GOTPCREL(%rip), %eax
				movl foo@GOTPCREL(%rip), %eax
				movl foo@GOTPCREL(%rip), %eax
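
The arithmetic comments in this test (`0x101e + 7 - 36 = 0x1001`) follow the usual RIP-relative rule: the operand address is the address of the next instruction plus the signed displacement. A small sketch that replays those sums against the expected listing is given below. `ripTarget` and `checkUndDsoListing` are hypothetical helper names, and the instruction lengths (7 bytes with a REX prefix, 6 without) are read off the byte dumps in the `DISASM` lines.

```cpp
#include <cassert>
#include <cstdint>

// Address referenced by a RIP-relative operand: next instruction address
// plus the sign-extended 32-bit displacement (hypothetical helper).
constexpr uint64_t ripTarget(uint64_t InsnAddr, unsigned InsnLen,
                             int32_t Disp) {
  return InsnAddr + InsnLen +
         static_cast<uint64_t>(static_cast<int64_t>(Disp));
}

// Replays values taken from the DISASM and RELOC lines of the test above.
void checkUndDsoListing() {
  // Relaxed accesses to the hidden symbol resolve straight to hid (0x1001).
  assert(ripTarget(0x101e, 7, -36) == 0x1001); // leaq, REX form
  assert(ripTarget(0x1025, 7, -43) == 0x1001);
  assert(ripTarget(0x1052, 6, -87) == 0x1001); // leal, no REX prefix
  assert(ripTarget(0x1058, 6, -93) == 0x1001);
  // Unrelaxed accesses land on the offsets of the GLOB_DAT relocations.
  assert(ripTarget(0x1002, 7, 4247) == 0x20a0); // und
  assert(ripTarget(0x1010, 7, 4241) == 0x20a8); // dsofoo
  assert(ripTarget(0x102c, 7, 4221) == 0x20b0); // foo
}
```

The last three checks tie the two halves of the test together: each `movq` that stays a GOT load points at the offset of the matching `R_X86_64_GLOB_DAT` entry in the `RELOC` block.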

lld/trunk/test/ELF/gotpc-relax.s

				# REQUIRES: x86
				# RUN: llvm-mc -filetype=obj -relax-relocations -triple=x86_64-unknown-linux %s -o %t.o
				# RUN: ld.lld %t.o -o %t1
				# RUN: llvm-readobj -r %t1 | FileCheck --check-prefix=RELOC %s
				# RUN: llvm-objdump -d %t1 | FileCheck --check-prefix=DISASM %s

				## There are no relocations.
				# RELOC: Relocations [
				# RELOC: ]

				# 0x11003 + 7 - 10 = 0x11000
				# 0x1100a + 7 - 17 = 0x11000
				# 0x11011 + 7 - 23 = 0x11001
				# 0x11018 + 7 - 30 = 0x11001
				# DISASM: Disassembly of section .text:
				# DISASM-NEXT: foo:
				# DISASM-NEXT: 11000: 90 nop
				# DISASM: hid:
				# DISASM-NEXT: 11001: 90 nop
				# DISASM: ifunc:
				# DISASM-NEXT: 11002: c3 retq
				# DISASM: _start:
				# DISASM-NEXT: 11003: 48 8d 05 f6 ff ff ff leaq -10(%rip), %rax
				# DISASM-NEXT: 1100a: 48 8d 05 ef ff ff ff leaq -17(%rip), %rax
				# DISASM-NEXT: 11011: 48 8d 05 e9 ff ff ff leaq -23(%rip), %rax
				# DISASM-NEXT: 11018: 48 8d 05 e2 ff ff ff leaq -30(%rip), %rax
				# DISASM-NEXT: 1101f: 48 8b 05 da 0f 00 00 movq 4058(%rip), %rax
				# DISASM-NEXT: 11026: 48 8b 05 d3 0f 00 00 movq 4051(%rip), %rax
				# DISASM-NEXT: 1102d: 8d 05 cd ff ff ff leal -51(%rip), %eax
				# DISASM-NEXT: 11033: 8d 05 c7 ff ff ff leal -57(%rip), %eax
				# DISASM-NEXT: 11039: 8d 05 c2 ff ff ff leal -62(%rip), %eax
				# DISASM-NEXT: 1103f: 8d 05 bc ff ff ff leal -68(%rip), %eax
				# DISASM-NEXT: 11045: 8b 05 b5 0f 00 00 movl 4021(%rip), %eax
				# DISASM-NEXT: 1104b: 8b 05 af 0f 00 00 movl 4015(%rip), %eax
				# DISASM-NEXT: 11051: ff 15 b1 0f 00 00 callq *4017(%rip)
				# DISASM-NEXT: 11057: ff 25 a3 0f 00 00 jmpq *4003(%rip)

				.text
				.globl foo
				.type foo, @function
				foo:
				nop

				.globl hid
				.hidden hid
				.type hid, @function
				hid:
				nop

				.text
				.type ifunc STT_GNU_IFUNC
				.globl ifunc
				.type ifunc, @function
				ifunc:
				ret

				.globl _start
				.type _start, @function
				_start:
				movq foo@GOTPCREL(%rip), %rax
				movq foo@GOTPCREL(%rip), %rax
				movq hid@GOTPCREL(%rip), %rax
				movq hid@GOTPCREL(%rip), %rax
				movq ifunc@GOTPCREL(%rip), %rax
				movq ifunc@GOTPCREL(%rip), %rax
				movl foo@GOTPCREL(%rip), %eax
				movl foo@GOTPCREL(%rip), %eax
				movl hid@GOTPCREL(%rip), %eax
				movl hid@GOTPCREL(%rip), %eax
				movl ifunc@GOTPCREL(%rip), %eax
				movl ifunc@GOTPCREL(%rip), %eax

				## We check a few other possible instructions
				## to see that they are not "relaxed" to lea by mistake.
				call *foo@GOTPCREL(%rip)
				jmp *ifunc@GOTPCREL(%rip)
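
The byte dumps in the `DISASM` lines of this test can be decoded by hand to see both the relaxed displacement and why the final `call`/`jmp` are left alone. The sketch below assumes the `[REX] opcode ModRM disp32` layout used by every instruction in the test. `trailingDisp32`, `opcodeByte` and `operandAddress` are hypothetical helpers written for illustration, not lld or LLVM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads the last 4 bytes of an instruction as a signed little-endian disp32.
int32_t trailingDisp32(const uint8_t *Insn, size_t Len) {
  assert(Len >= 4 && "instruction too short to hold a disp32");
  uint32_t U = 0;
  for (int I = 0; I < 4; ++I)
    U |= static_cast<uint32_t>(Insn[Len - 4 + I]) << (8 * I);
  return static_cast<int32_t>(U); // relies on two's complement conversion
}

// Opcode byte of an "[REX] opcode ModRM disp32" encoding: the byte before
// ModRM, i.e. two bytes before the displacement.
uint8_t opcodeByte(const uint8_t *Insn, size_t Len) {
  assert(Len >= 6 && "expected opcode + ModRM + disp32");
  return Insn[Len - 6];
}

// Address the RIP-relative operand refers to.
uint64_t operandAddress(uint64_t Addr, const uint8_t *Insn, size_t Len) {
  int64_t Disp = trailingDisp32(Insn, Len);
  return Addr + Len + static_cast<uint64_t>(Disp);
}
```

Feeding in `48 8d 05 e9 ff ff ff` at `0x11011` gives a displacement of -23 and an operand address of `0x11001`, the address of `hid`. The `callq` bytes `ff 15 b1 0f 00 00` at `0x11051` have opcode byte `0xff`, so the `0x8b` test in `canRelaxGot` rejects them, and their operand address works out to `0x12008`, presumably a GOT slot that is still needed.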