This is an archive of the discontinued LLVM Phabricator instance.

Differential D20622

[ELF] - Added support for jmp/call relaxations when R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX are used.
Closed, Public

Authored by grimar on May 25 2016, 8:01 AM.


Details

Reviewers

ruiu
• rafael

Commits

rG95433df129c4: [ELF] - Added support for jmp/call relaxations when…
rLLD270721: [ELF] - Added support for jmp/call relaxations when…
rL270721: [ELF] - Added support for jmp/call relaxations when…

Summary

D15779 introduced the basic approach to supporting the new relaxations.
This patch implements the relaxations for jmp and call instructions
described in the System V Application Binary Interface AMD64 Architecture Processor
Supplement Draft Version 0.99.8 (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf,
section B.2 "Optimize GOTPCRELX Relocations").

Diff Detail

Repository: rL LLVM

Event Timeline

grimar updated this revision to Diff 58423. May 25 2016, 8:01 AM

grimar retitled this revision to [ELF] - Added support for jmp/call relaxations when R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX are used.

grimar updated this object.

grimar added reviewers: • rafael, ruiu.

grimar added subscribers: llvm-commits, grimar.

ELF/Target.cpp

+++ ELF/Target.cpp
@@ -740,14 +740,39 @@
                                 uint64_t Offset) const {
  if (Type != R_X86_64_GOTPCRELX && Type != R_X86_64_REX_GOTPCRELX)
    return false;
- // Converting mov foo@GOTPCREL(%rip), %reg to lea foo(%rip), %reg
- // is the only supported relaxation for now.
- return (Offset >= 2 && Data[Offset - 2] == 0x8b);

This deletes the "Offset >= 2" check. Do you know if it will be needed
once all optimizations are implemented? If it is there just to guard
against corrupted inputs, for now just remove the Offset argument and
pass Data+Offset to this function. That can be another patch.

+ if (Op == 0x8b) {
+ // Convert mov foo@GOTPCREL(%rip), %reg to lea foo(%rip), %reg.
+ *(Loc - 2) = 0x8d;

Use an early return here.

+ } else if (Op == 0xff) {
+ if (ModRm == 0x15) {
+ // ABI says we can convert call *foo@GOTPCREL(%rip) to nop call foo.
+ // Instead we convert to addr32 call foo, where addr32 is instruction
+ // prefix. That makes result expression to be a single instruction.

Interesting idea. For tls data16 and rex64 are used. Any idea which
one is better when? Would you mind sending hjl.tools@gmail.com this
suggestion for addition in the psabi?

+ *(Loc - 2) = 0x67; // addr32 prefix
+ *(Loc - 1) = 0xe8; // call

early return.

+ } else {
+ // ModRm == 0x25.
+ // Convert jmp *foo@GOTPCREL(%rip) to jmp foo nop.

Can't you use a prefix in here?

+ *(Loc - 2) = 0xe9; // jmp
+ *(Loc + 3) = 0x90; // nop
+ Loc -= 1;
+ Val += 1;
+ }
+ }
+
relocateOne(Loc, R_X86_64_PC32, Val);
}

Cheers,
Rafael

In D20622#439251, @rafael wrote:
ELF/Target.cpp

+++ ELF/Target.cpp
@@ -740,14 +740,39 @@
                                 uint64_t Offset) const {
  if (Type != R_X86_64_GOTPCRELX && Type != R_X86_64_REX_GOTPCRELX)
    return false;
- // Converting mov foo@GOTPCREL(%rip), %reg to lea foo(%rip), %reg
- // is the only supported relaxation for now.
- return (Offset >= 2 && Data[Offset - 2] == 0x8b);

This deletes the "Offset >= 2" check. Do you know if it will be needed
once all optimizations are implemented? If it is there just to guard
against corrupted inputs, for now just remove the Offset argument and
pass Data+Offset to this function. That can be another patch.

I think there should be no way to have such relocations with Offset < 2.
All instructions that can be relaxed seem to be safe to check without this.
I guess that gold/bfd did that just to protect against incorrect inputs. I do not think this
is really needed.
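
For illustration only, the guard under discussion could be written as below if protection against corrupted inputs were wanted. This is a sketch, not lld code: `canRelaxChecked` and its `Size` parameter are hypothetical names introduced here, and the opcode and ModRM values are the ones the patch itself checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of lld): decides whether the instruction that
// owns a GOTPCRELX relocation at Data[Offset] can be relaxed, and refuses to
// read outside the section instead of trusting the input.
bool canRelaxChecked(const uint8_t *Data, size_t Size, size_t Offset) {
  assert(Data && "section data must not be null");
  // The opcode and ModRM bytes sit right before the 4-byte relocated field,
  // so both Offset - 2 and Offset + 3 must be valid indices.
  if (Offset < 2 || Size < 4 || Offset > Size - 4)
    return false;
  const uint8_t Op = Data[Offset - 2];
  const uint8_t ModRm = Data[Offset - 1];
  if (Op == 0x8b) // mov foo@GOTPCREL(%rip), %reg
    return true;
  // call *foo@GOTPCREL(%rip) or jmp *foo@GOTPCREL(%rip)
  return Op == 0xff && (ModRm == 0x15 || ModRm == 0x25);
}
```

With a well-formed object file the extra checks never fire, which matches the argument above that they only matter for incorrect inputs.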

+ if (Op == 0x8b) {
+ // Convert mov foo@GOTPCREL(%rip), %reg to lea foo(%rip), %reg.
+ *(Loc - 2) = 0x8d;

Use an early return here.

+ } else if (Op == 0xff) {
+ if (ModRm == 0x15) {
+ // ABI says we can convert call *foo@GOTPCREL(%rip) to nop call foo.
+ // Instead we convert to addr32 call foo, where addr32 is instruction
+ // prefix. That makes result expression to be a single instruction.

Interesting idea. For tls data16 and rex64 are used. Any idea which
one is better when? Would you mind sending hjl.tools@gmail.com this
suggestion for addition in the psabi?

Unfortunately it is not my idea. It is something I did not understand at first from the gnu ld output,
but after some research into what it is doing, I think I got the idea right.

+ *(Loc - 2) = 0x67; // addr32 prefix
+ *(Loc - 1) = 0xe8; // call

early return.

+ } else {
+ // ModRm == 0x25.
+ // Convert jmp *foo@GOTPCREL(%rip) to jmp foo nop.

Can't you use a prefix in here?

I did not investigate that yet. I guess there might be some trouble with incompatibility
of prefixes with some instructions, but that is just a guess. bfd does the same here and
I didn't have a chance to dig into it. So they use a prefix for call and do not do that for jmp.
gold does not relax jmp/call at all, it seems.

+ *(Loc - 2) = 0xe9; // jmp
+ *(Loc + 3) = 0x90; // nop
+ Loc -= 1;
+ Val += 1;
+ }
+ }
+
relocateOne(Loc, R_X86_64_PC32, Val);
}
Cheers,
Rafael

In D20622#439251, @rafael wrote:

Can't you use a prefix in here?

I'll try to investigate this tomorrow.

George.

In D20622#439251, @rafael wrote:

+ // Convert jmp *foo@GOTPCREL(%rip) to jmp foo nop.

Can't you use a prefix in here?

Ah, I think I know! The ABI says:
jmp *foo@GOTPCREL(%rip) can be relaxed to jmp foo nop

For call it gives two ways:
call foo nop
nop call foo

But for jmp only one way is allowed. Therefore we can't replace the nop with something else,
such as an instruction prefix, without violating the ABI.
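
To make the forms listed above concrete, here is a small sketch of their byte layouts. It is an illustration only: the function names are hypothetical, and `Rel` is assumed to be the displacement from the end of the original 6-byte indirect instruction to the target. The opcodes used are `e8` (call rel32), `e9` (jmp rel32), `90` (nop) and `67` (address-size prefix).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes6 = std::array<uint8_t, 6>;

// Stores V as a 32-bit little-endian value at Buf[Pos..Pos+3].
static void put32le(Bytes6 &Buf, unsigned Pos, int32_t V) {
  assert(Pos + 4 <= Buf.size());
  const uint32_t U = static_cast<uint32_t>(V);
  for (unsigned I = 0; I < 4; ++I)
    Buf[Pos + I] = static_cast<uint8_t>(U >> (8 * I));
}

// "nop; call foo": the call still ends at byte 6, so Rel is used unchanged.
Bytes6 nopCall(int32_t Rel) {
  Bytes6 B{{0x90, 0xe8}};
  put32le(B, 2, Rel);
  return B;
}

// "addr32 call foo": same layout, with a 0x67 prefix in place of the nop.
Bytes6 addr32Call(int32_t Rel) {
  Bytes6 B{{0x67, 0xe8}};
  put32le(B, 2, Rel);
  return B;
}

// "call foo; nop": the call now ends at byte 5, one byte earlier, so the
// displacement has to grow by one to reach the same target.
Bytes6 callNop(int32_t Rel) {
  Bytes6 B{{0xe8}};
  put32le(B, 1, Rel + 1);
  B[5] = 0x90;
  return B;
}

// "jmp foo; nop": same shift as callNop, with the jmp opcode.
Bytes6 jmpNop(int32_t Rel) {
  Bytes6 B{{0xe9}};
  put32le(B, 1, Rel + 1);
  B[5] = 0x90;
  return B;
}
```

All four forms occupy the same six bytes as the indirect instruction they replace. The one-byte shift in the trailing-nop forms is why the patch adjusts `Loc` and `Val` in the jmp case.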

Cheers,
Rafael

Addressed review comments.

LGTM with nit.

ELF/Target.cpp
764 (On Diff #58442): Make this an assert, since we only got here if canRelaxGot returned true.
774 (On Diff #58442): Replace comment with assert.

This revision is now accepted and ready to land. May 25 2016, 9:50 AM

Closed by commit rL270721: [ELF] - Added support for jmp/call relaxations when… (authored by grimar). May 25 2016, 9:57 AM

This revision was automatically updated to reflect the committed changes.

grimar marked 2 inline comments as done.

In D20622#439404, @rafael wrote:

LGTM with nit.

Thanks for such fast reviews of this!

Revision Contents

Path                                Size
lld/trunk/ELF/Target.cpp            38 lines
lld/trunk/test/ELF/gotpc-relax.s    30 lines

Diff 58446

lld/trunk/ELF/Target.cpp

[734 lines not shown]
   default:
     fatal("unrecognized reloc " + Twine(Type));
   }
 }

 bool X86_64TargetInfo::canRelaxGot(uint32_t Type, const uint8_t *Data,
                                    uint64_t Offset) const {
   if (Type != R_X86_64_GOTPCRELX && Type != R_X86_64_REX_GOTPCRELX)
     return false;
-
-  // Converting mov foo@GOTPCREL(%rip), %reg to lea foo(%rip), %reg
-  // is the only supported relaxation for now.
-  return (Offset >= 2 && Data[Offset - 2] == 0x8b);
+  const uint8_t Op = Data[Offset - 2];
+  const uint8_t ModRm = Data[Offset - 1];
+  // Relax mov.
+  if (Op == 0x8b)
+    return true;
+  // Relax call and jmp.
+  return Op == 0xff && (ModRm == 0x15 || ModRm == 0x25);
 }

 void X86_64TargetInfo::relaxGot(uint8_t *Loc, uint64_t Val) const {
-  Loc[-2] = 0x8d;
+  const uint8_t Op = Loc[-2];
+  const uint8_t ModRm = Loc[-1];
+
+  // Convert mov foo@GOTPCREL(%rip), %reg to lea foo(%rip), %reg.
+  if (Op == 0x8b) {
+    *(Loc - 2) = 0x8d;
+    relocateOne(Loc, R_X86_64_PC32, Val);
+    return;
+  }
+
+  assert(Op == 0xff);
+  if (ModRm == 0x15) {
+    // ABI says we can convert call *foo@GOTPCREL(%rip) to nop call foo.
+    // Instead we convert to addr32 call foo, where addr32 is instruction
+    // prefix. That makes result expression to be a single instruction.
+    *(Loc - 2) = 0x67; // addr32 prefix
+    *(Loc - 1) = 0xe8; // call
+  } else {
+    assert(ModRm == 0x25);
+    // Convert jmp *foo@GOTPCREL(%rip) to jmp foo nop.
+    // jmp doesn't return, so it is fine to use nop here, it is just a stub.
+    *(Loc - 2) = 0xe9; // jmp
+    *(Loc + 3) = 0x90; // nop
+    Loc -= 1;
+    Val += 1;
+  }
   relocateOne(Loc, R_X86_64_PC32, Val);
 }

 // Relocation masks following the #lo(value), #hi(value), #ha(value),
 // #higher(value), #highera(value), #highest(value), and #highesta(value)
 // macros defined in section 4.5.1. Relocation Types of the PPC-elf64abi
 // document.
 static uint16_t applyPPCLo(uint64_t V) { return V; }
[803 lines not shown]
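
The committed relaxGot can be exercised outside the linker with a small stand-alone model. The sketch below makes several assumptions: `writePC32` is a hypothetical stand-in for relocateOne(Loc, R_X86_64_PC32, Val) that only writes the four bytes, `relaxGotModel` and `pcRelValue` are names introduced here, and the value passed in is taken to be S + A - P with an addend of -4. The address 0x11000 for foo is inferred from the displacements in the test expectations, not stated in the review.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for relocateOne(Loc, R_X86_64_PC32, Val): writes the low 32 bits
// of Val in little-endian order. Only the byte-writing part is modeled.
static void writePC32(uint8_t *Loc, uint64_t Val) {
  for (int I = 0; I < 4; ++I)
    Loc[I] = static_cast<uint8_t>(Val >> (8 * I));
}

// Model of the committed relaxGot. Loc points at the 4-byte relocated field,
// so Loc[-2] is the opcode and Loc[-1] is the ModRM byte.
void relaxGotModel(uint8_t *Loc, uint64_t Val) {
  const uint8_t Op = Loc[-2];
  const uint8_t ModRm = Loc[-1];
  if (Op == 0x8b) { // mov -> lea
    Loc[-2] = 0x8d;
    writePC32(Loc, Val);
    return;
  }
  assert(Op == 0xff);
  if (ModRm == 0x15) { // call *foo@GOTPCREL(%rip) -> addr32 call foo
    Loc[-2] = 0x67;
    Loc[-1] = 0xe8;
  } else { // jmp *foo@GOTPCREL(%rip) -> jmp foo; nop
    assert(ModRm == 0x25);
    Loc[-2] = 0xe9;
    Loc[3] = 0x90;
    Loc -= 1; // the rel32 field now starts one byte earlier...
    Val += 1; // ...so the displacement grows by one
  }
  writePC32(Loc, Val);
}

// Assumed relocation value: S + A - P with A = -4, where FieldAddr is the
// address of the relocated field. Unsigned arithmetic wraps as intended.
uint64_t pcRelValue(uint64_t Sym, uint64_t FieldAddr) {
  return Sym - 4 - FieldAddr;
}
```

For example, patching the bytes `ff 15 b1 0f 00 00` with `pcRelValue(0x11000, 0x11053)` yields `67 e8 a9 ff ff ff`, the sequence the updated test expects at address 11051.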

lld/trunk/test/ELF/gotpc-relax.s

[26 lines not shown]
 # DISASM-NEXT: 1101f: 48 8b 05 da 0f 00 00 movq 4058(%rip), %rax
 # DISASM-NEXT: 11026: 48 8b 05 d3 0f 00 00 movq 4051(%rip), %rax
 # DISASM-NEXT: 1102d: 8d 05 cd ff ff ff leal -51(%rip), %eax
 # DISASM-NEXT: 11033: 8d 05 c7 ff ff ff leal -57(%rip), %eax
 # DISASM-NEXT: 11039: 8d 05 c2 ff ff ff leal -62(%rip), %eax
 # DISASM-NEXT: 1103f: 8d 05 bc ff ff ff leal -68(%rip), %eax
 # DISASM-NEXT: 11045: 8b 05 b5 0f 00 00 movl 4021(%rip), %eax
 # DISASM-NEXT: 1104b: 8b 05 af 0f 00 00 movl 4015(%rip), %eax
-# DISASM-NEXT: 11051: ff 15 b1 0f 00 00 callq *4017(%rip)
-# DISASM-NEXT: 11057: ff 25 a3 0f 00 00 jmpq *4003(%rip)
+# DISASM-NEXT: 11051: 67 e8 a9 ff ff ff callq -87 <foo>
+# DISASM-NEXT: 11057: 67 e8 a3 ff ff ff callq -93 <foo>
+# DISASM-NEXT: 1105d: 67 e8 9e ff ff ff callq -98 <hid>
+# DISASM-NEXT: 11063: 67 e8 98 ff ff ff callq -104 <hid>
+# DISASM-NEXT: 11069: ff 15 91 0f 00 00 callq *3985(%rip)
+# DISASM-NEXT: 1106f: ff 15 8b 0f 00 00 callq *3979(%rip)
+# DISASM-NEXT: 11075: e9 86 ff ff ff jmp -122 <foo>
+# DISASM-NEXT: 1107a: 90 nop
+# DISASM-NEXT: 1107b: e9 80 ff ff ff jmp -128 <foo>
+# DISASM-NEXT: 11080: 90 nop
+# DISASM-NEXT: 11081: e9 7b ff ff ff jmp -133 <hid>
+# DISASM-NEXT: 11086: 90 nop
+# DISASM-NEXT: 11087: e9 75 ff ff ff jmp -139 <hid>
+# DISASM-NEXT: 1108c: 90 nop
+# DISASM-NEXT: 1108d: ff 25 6d 0f 00 00 jmpq *3949(%rip)
+# DISASM-NEXT: 11093: ff 25 67 0f 00 00 jmpq *3943(%rip)

 .text
 .globl foo
 .type foo, @function
 foo:
 nop

 .globl hid
[20 lines not shown]
 movq ifunc@GOTPCREL(%rip), %rax
 movl foo@GOTPCREL(%rip), %eax
 movl foo@GOTPCREL(%rip), %eax
 movl hid@GOTPCREL(%rip), %eax
 movl hid@GOTPCREL(%rip), %eax
 movl ifunc@GOTPCREL(%rip), %eax
 movl ifunc@GOTPCREL(%rip), %eax

-## We check few other possible instructions
-## to see that they are not "relaxed" by mistake to lea.
 call *foo@GOTPCREL(%rip)
+call *foo@GOTPCREL(%rip)
+call *hid@GOTPCREL(%rip)
+call *hid@GOTPCREL(%rip)
+call *ifunc@GOTPCREL(%rip)
+call *ifunc@GOTPCREL(%rip)
+jmp *foo@GOTPCREL(%rip)
+jmp *foo@GOTPCREL(%rip)
+jmp *hid@GOTPCREL(%rip)
+jmp *hid@GOTPCREL(%rip)
+jmp *ifunc@GOTPCREL(%rip)
 jmp *ifunc@GOTPCREL(%rip)
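
As a cross-check of the DISASM expectations, the rel32 fields can be decoded by hand: the target of a direct call or jmp is the address of the next instruction plus the signed displacement. The helper below is a hypothetical sketch for that arithmetic, not part of the test. The addresses 0x11000 for foo and 0x11001 for hid are inferred from the listed displacements (hid follows foo's one-byte nop).

```cpp
#include <cassert>
#include <cstdint>

// Computes the destination of a PC-relative instruction located at InsnAddr
// with total length InsnLen, whose last four bytes are the little-endian
// rel32 field pointed to by Rel32.
uint64_t branchTarget(uint64_t InsnAddr, unsigned InsnLen,
                      const uint8_t *Rel32) {
  assert(Rel32 && InsnLen >= 5);
  uint32_t U = 0;
  for (int I = 0; I < 4; ++I)
    U |= static_cast<uint32_t>(Rel32[I]) << (8 * I);
  // Sign-extend the 32-bit field portably.
  const int64_t Disp = (U & 0x80000000u)
                           ? static_cast<int64_t>(U) - (int64_t(1) << 32)
                           : static_cast<int64_t>(U);
  return InsnAddr + InsnLen + static_cast<uint64_t>(Disp);
}
```

Applied to the line at 11051 (six bytes, field `a9 ff ff ff`) this gives 0x11057 - 87 = 0x11000, and applied to the jmp at 11075 (five bytes, field `86 ff ff ff`) it gives 0x1107a - 122 = 0x11000, so both relaxed forms reach the same symbol.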