[LLD][PowerPC] Fix bug in PC-Relative initial exec

Authored by stefanp on Mar 22 2021, 7:53 AM.

Description

There is a bug when a PC-relative initial-exec TLS access is relaxed to local exec.
Consider the following situation:

InitExec.c

extern __thread unsigned TGlobal;
unsigned getConst(unsigned*);
unsigned addVal(unsigned, unsigned*);

unsigned GetAddrT() {
  return addVal(getConst(&TGlobal), &TGlobal);
}

Def.c

__thread unsigned TGlobal;

unsigned getConst(unsigned* A) {
  return *A + 3;
}

unsigned addVal(unsigned A, unsigned* B) {
  return A + *B;
}

The problem is in InitExec.c, but Def.c is required to link the example and see the problem.
To compile everything:

clang -O3 -mcpu=pwr10 -c InitExec.c
clang -O3 -mcpu=pwr10 -c Def.c
ld.lld InitExec.o Def.o -o IeToLe

If you objdump the problem object file:

$ llvm-objdump -dr --mcpu=pwr10 InitExec.o

you will get the following assembly:

0000000000000000 <GetAddrT>:
       0: a6 02 08 7c  	mflr 0
       4: f0 ff c1 fb  	std 30, -16(1)
       8: 10 00 01 f8  	std 0, 16(1)
       c: d1 ff 21 f8  	stdu 1, -48(1)
      10: 00 00 10 04 00 00 60 e4      	pld 3, 0(0), 1
		0000000000000010:  R_PPC64_GOT_TPREL_PCREL34	TGlobal
      18: 14 6a c3 7f  	add 30, 3, 13
		0000000000000019:  R_PPC64_TLS	TGlobal
      1c: 78 f3 c3 7f  	mr	3, 30
      20: 01 00 00 48  	bl 0x20
		0000000000000020:  R_PPC64_REL24_NOTOC	getConst
      24: 78 f3 c4 7f  	mr	4, 30
      28: 30 00 21 38  	addi 1, 1, 48
      2c: 10 00 01 e8  	ld 0, 16(1)
      30: f0 ff c1 eb  	ld 30, -16(1)
      34: a6 03 08 7c  	mtlr 0
      38: 00 00 00 48  	b 0x38
		0000000000000038:  R_PPC64_REL24_NOTOC	addVal

The lines of interest are:

      10: 00 00 10 04 00 00 60 e4      	pld 3, 0(0), 1
		0000000000000010:  R_PPC64_GOT_TPREL_PCREL34	TGlobal
      18: 14 6a c3 7f  	add 30, 3, 13
		0000000000000019:  R_PPC64_TLS	TGlobal
      1c: 78 f3 c3 7f  	mr	3, 30

Once linked, these instructions are turned into:

10010210: ff ff 03 06 00 90 6d 38      	paddi 3, 13, -28672, 0
10010218: 00 00 00 60  	nop
1001021c: 78 f3 c3 7f  	mr	3, 30
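
As a side note on the -28672 immediate in the paddi: the sketch below assumes the usual PPC64 ELF convention that the thread pointer (register 13) points 0x7000 bytes past the start of the TLS block, and that TGlobal sits at offset 0 in that block because it is the only thread-local variable in the example. The helper name tprel_offset and the macro PPC64_TP_BIAS are made up for illustration and are not LLD identifiers.

```c
#include <stdint.h>

/* Assumption: the thread pointer is biased 0x7000 bytes past the start
   of the TLS block, as in the PPC64 ELF TLS convention. */
#define PPC64_TP_BIAS 0x7000

/* Hypothetical helper: thread-pointer-relative offset of a TLS symbol,
   given its offset from the start of the TLS block. */
static int64_t tprel_offset(uint64_t offset_in_tls_block) {
  return (int64_t)offset_in_tls_block - PPC64_TP_BIAS;
}
```

Under those assumptions tprel_offset(0) is -28672, which matches the immediate shown in the linked output.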

The problem is that register 30 is never set after the relaxation: the paddi
leaves the address in register 3, and the add that used to write register 30
is gone, yet the following mr still reads register 30.

Therefore it is not correct to relax the above instructions by replacing
the add instruction with a nop.
Instead, the add instruction should be replaced with a copy (mr) instruction.
If the add uses the same register as input and as output, then it is safe to
continue to replace the add with a nop.
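
The rule above can be sketched in C as follows. This is only an illustration of the decision, not the code from the patch: the function name relax_tls_add and the macro PPC_NOP are hypothetical, and the sketch assumes the instruction is the 32-bit add tagged with R_PPC64_TLS whose first source register (RA) is the one the relaxed paddi writes.

```c
#include <assert.h>
#include <stdint.h>

#define PPC_NOP 0x60000000u /* ori 0, 0, 0 */

/* Hypothetical helper: given the encoding of the `add RT, RA, 13` that
   carries R_PPC64_TLS, return the instruction that should replace it
   once the preceding pld has been relaxed to a paddi writing RA. */
static uint32_t relax_tls_add(uint32_t insn) {
  uint32_t primary = insn >> 26;             /* bits 0-5 */
  uint32_t secondary = (insn & 0x7FEu) >> 1; /* bits 21-30: OE + extended opcode */
  assert(primary == 31 && secondary == 266 && "expected an add instruction");

  uint32_t rt = (insn >> 21) & 0x1Fu; /* bits 6-10: destination */
  uint32_t ra = (insn >> 16) & 0x1Fu; /* bits 11-15: first source */

  /* Same input and output register: the paddi result is already in place. */
  if (rt == ra)
    return PPC_NOP;

  /* Otherwise copy the paddi result into the add's destination.
     `mr rt, ra` is `or rt, ra, ra`: source in bits 6-10 and 16-20,
     destination in bits 11-15. */
  return 0x7C000378u | (ra << 21) | (rt << 16) | (ra << 11);
}
```

For the add from the listing (bytes 14 6a c3 7f, that is 0x7fc36a14, add 30, 3, 13) the sketch returns 0x7c7e1b78, which is mr 30, 3. For an add such as add 3, 3, 13 (0x7c636a14) it returns the nop 0x60000000.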

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D95262

Details

Committed
stefanp on Mar 22 2021, 11:15 AM
Reviewer
MaskRay
Differential Revision
D95262: [LLD][PowerPC] Fix bug in PC-Relative initial exec
Parents
rGcec244354bb1: Fix the order of directives and the target string