This is an archive of the discontinued LLVM Phabricator instance.

Differential D112600

[ARM] Use hardware TLS register in Thumb2 mode when -mtp=cp15 is passed
ClosedPublic

Authored by ardb on Oct 27 2021, 1:57 AM.

Details

Reviewers

nickdesaulniers
nathanchance
psmith
peter.smith

Commits

rGd7e089f2d6a5: [ARM] Use hardware TLS register in Thumb2 mode when -mtp=cp15 is passed

Summary

In ARM mode, passing -mtp=cp15 forces the use of an inline MRC system register read to move the thread pointer value into a register.

Currently, in Thumb2 mode, -mtp=cp15 is ignored, and a call to the __aeabi_read_tp helper is emitted instead.

This is inconsistent and breaks the Linux/ARM build for Thumb2 targets, as the Linux kernel does not provide an implementation of __aeabi_read_tp.
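The behaviour described in the summary can be modelled as a small decision function. This is only an illustrative sketch, not LLVM code: `lower_thread_pointer`, its parameters, and the `patched` switch are hypothetical names, and it assumes that -mtp=cp15 corresponds to the `+read-tp-hard` feature used in the tests on this revision.

```python
# Hypothetical model of how a thread-pointer read is lowered; not LLVM code.
MRC_READ_TP = "mrc p15, #0, rN, c13, c0, #3"   # inline CP15 read
SOFT_READ_TP = "bl __aeabi_read_tp"            # helper call


def lower_thread_pointer(mode: str, read_tp_hard: bool, patched: bool = True) -> str:
    """Return the assembly idiom expected for a thread-pointer read.

    mode is "arm" or "thumb2"; read_tp_hard is assumed to mirror -mtp=cp15.
    patched=False models the behaviour the summary calls inconsistent.
    """
    if mode not in ("arm", "thumb2"):
        raise ValueError(f"mode not covered by this sketch: {mode}")
    # Before the patch only ARM mode honoured the hard thread pointer request.
    hard_honoured = read_tp_hard and (mode == "arm" or patched)
    return MRC_READ_TP if hard_honoured else SOFT_READ_TP


if __name__ == "__main__":
    for mode in ("arm", "thumb2"):
        for patched in (False, True):
            print(mode, patched, lower_thread_pointer(mode, True, patched))
```

Only the Thumb2 row changes between `patched=False` and `patched=True`, which is the whole effect of this revision.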

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ardb created this revision. Oct 27 2021, 1:57 AM

Herald added subscribers: hiraditya, kristof.beyls. Oct 27 2021, 1:57 AM

ardb requested review of this revision. Oct 27 2021, 1:57 AM

Herald added projects: Restricted Project, Restricted Project. Oct 27 2021, 1:57 AM

Herald added subscribers: llvm-commits, cfe-commits.

Harbormaster completed remote builds in B130880: Diff 382558. Oct 27 2021, 3:13 AM

I tested this against next-20211027, where it resolved the build error and boots in QEMU master without any visible issues. Disassembly between GCC and clang seems to be good. I'll leave it up to @nickdesaulniers and @peter.smith to approve, as I am not very familiar with TableGen.

LGTM, as CP15 can be used on architecture v6k and above, which maps to IsThumb2.

As an aside from this patch, the Arm state could be considered too permissive, as it will permit -mtp=cp15 on architectures that wouldn't have the coprocessor register, like arm7tdmi. GCC also permits this, though, so I guess we're in "set this option with care" territory.

This revision is now accepted and ready to land. Oct 27 2021, 9:55 AM

Thanks for the patch!

clang/test/CodeGen/arm-tphard.c

10 ↗

(On Diff #382558)

Let's make this a test under llvm/test/CodeGen/, using IR:

; RUN: llc --mtriple=armv7-linux-gnueabihf -o - %s | FileCheck %s
; RUN: llc --mtriple=thumbv7-linux-gnu -o - %s | FileCheck %s
define dso_local i8* @tphard() "target-features"="+read-tp-hard" {
; CHECK-NOT: __aeabi_read_tp
  %1 = tail call i8* @llvm.thread.pointer()
  ret i8* %1
}

declare i8* @llvm.thread.pointer()

This revision now requires changes to proceed. Oct 27 2021, 10:28 AM

nickdesaulniers mentioned this in D34408: [ARM] - Add the option to directly access TLS pointer. Oct 27 2021, 10:32 AM

nickdesaulniers added subscribers: t.p.northover, rengolin, spetrovic.

ardb added inline comments. Oct 27 2021, 2:30 PM

clang/test/CodeGen/arm-tphard.c
10 ↗ (On Diff #382558)

> Let's make this a test under llvm/test/CodeGen/, using IR:

Why is that better?

> ; RUN: llc --mtriple=armv7-linux-gnueabihf -o - %s | FileCheck %s
> ; RUN: llc --mtriple=thumbv7-linux-gnu -o - %s | FileCheck %s

Are you sure using this triple forces generation of Thumb2 code? It didn't seem to when I tried.

> define dso_local i8* @tphard() "target-features"="+read-tp-hard" {
> ; CHECK-NOT: __aeabi_read_tp

Are you sure __aeabi_read_tp will appear in the IR for soft TP?

>   %1 = tail call i8* @llvm.thread.pointer()
>   ret i8* %1
> }
>
> declare i8* @llvm.thread.pointer()

nickdesaulniers added inline comments. Oct 27 2021, 3:09 PM

clang/test/CodeGen/arm-tphard.c
10 ↗ (On Diff #382558)

> Why is that better?

Because the front end isn't really involved in this back-end codegen bug, so we should just be testing the back end.

> Are you sure using this triple forces generation of Thumb2 code? It didn't seem to when I tried.

Good question; I guess for thumb2 there's no command line flag passed to the compiler that says "I would like thumb[1] as opposed to thumb2"? Maybe @peter.smith can provide some color on that? Is it simply armv7 for thumb2 vs v6 for thumb[1], or something else?

> Are you sure __aeabi_read_tp will appear in the IR for soft TP?

`__aeabi_read_tp` will never appear in the IR (unless someone explicitly called it). Instead, we're testing that the intrinsic (`@llvm.thread.pointer`) is lowered to either that libcall or `mrc`. `__aeabi_read_tp` may appear in the object file or assembler code generated from that intrinsic call.

But also, it looks like there's already coverage for `__aeabi_read_tp` being generated for soft TP. See also llvm/test/CodeGen/ARM/readtp.ll. There's also thread_pointer.ll in that same dir (and a few more tests mentioning `__aeabi_read_tp`) that all look like candidates to add these tests to rather than creating a new test file, perhaps.

Drop new Clang CodeGen test, and add the Thumb2 check to an existing backend test instead.

Cool, if you wouldn't mind additionally extending this patch to cover the intrinsic that the front end also generates, this LGTM:

diff --git a/llvm/test/CodeGen/ARM/thread_pointer.ll b/llvm/test/CodeGen/ARM/thread_pointer.ll
index c6318a58277c..f1ef2ddac2d0 100644
--- a/llvm/test/CodeGen/ARM/thread_pointer.ll
+++ b/llvm/test/CodeGen/ARM/thread_pointer.ll
@@ -1,4 +1,7 @@
-; RUN: llc -mtriple arm-linux-gnueabi -filetype asm -o - %s | FileCheck %s
+; RUN: llc -mtriple arm-linux-gnueabi -o - %s | FileCheck %s -check-prefix=CHECK-SOFT
+; RUN: llc -mtriple arm-linux-gnueabi -mattr=+read-tp-hard -o - %s | FileCheck %s -check-prefix=CHECK-HARD
+; RUN: llc -mtriple thumbv7-linux-gnueabi -o - %s | FileCheck %s -check-prefix=CHECK-SOFT
+; RUN: llc -mtriple thumbv7-linux-gnueabi -mattr=+read-tp-hard -o - %s | FileCheck %s -check-prefix=CHECK-HARD
 
 declare i8* @llvm.thread.pointer()
 
@@ -8,5 +11,6 @@ entry:
   ret i8* %tmp1
 }
 
-; CHECK: bl __aeabi_read_tp
+; CHECK-SOFT: bl __aeabi_read_tp
+; CHECK-HARD: mrc p15, #0, {{r[0-9]+}}, c13, c0, #3

Thanks for the patch!

This revision is now accepted and ready to land. Oct 27 2021, 3:55 PM

Add another test suggested by Nick.

nickdesaulniers accepted this revision. Oct 27 2021, 4:18 PM

Thanks all. Could someone with commit access please merge this? Thanks.

In D112600#3091959, @ardb wrote:

Thanks all. Could someone with commit access please merge this? Thanks.

Sure thing; I'll pick it up. Thanks again for the patch!

This revision was landed with ongoing or failed builds. Oct 27 2021, 4:46 PM

Closed by commit rGd7e089f2d6a5: [ARM] Use hardware TLS register in Thumb2 mode when -mtp=cp15 is passed (authored by ardb, committed by nickdesaulniers).

This revision was automatically updated to reflect the committed changes.

nickdesaulniers added a commit: rGd7e089f2d6a5: [ARM] Use hardware TLS register in Thumb2 mode when -mtp=cp15 is passed.

Harbormaster completed remote builds in B131070: Diff 382839. Oct 27 2021, 4:59 PM

peter.smith added inline comments. Oct 28 2021, 10:02 AM

clang/test/CodeGen/arm-tphard.c
10 ↗ (On Diff #382558)

> Are you sure using this triple forces generation of Thumb2 code? It didn't seem to when I tried.
>
> Good question; I guess for thumb2 there's no command line flag passed to the compiler that says "I would like thumb[1] as opposed to thumb2"? Maybe @peter.smith can provide some color on that? Is it simply armv7 for thumb2 vs v6 for thumb[1], or something else?

As I understand it, "Thumb2 Technology", to give it the marketing name, is still the Thumb (T32) ISA; it just has access to a lot more instructions than Thumb. In the backend this means that the compiler is free to select instructions with the Thumb2 predicate. It doesn't have to if there is an equivalent 2-byte sized Thumb instruction that does the job.

In theory, to get Thumb-only code, -march=armv6 might work; armv6 is the intersection of the A, R and M profiles that include no Thumb 2.

The architecture for Thumb 2 is close to v7+, although like most things Arm it is more complicated:

- Armv6T2, as implemented by one CPU, the Arm1156t2-s (which I wouldn't expect to see running Linux), has Thumb 2. All other Arm v6 CPUs, including the Arm 1176jzf-s (original Raspberry Pi), do not have Thumb 2.
- All Arm v7 CPUs have Thumb 2, including M, R and A profiles.
- All Arm v8 R and A profile CPUs have Thumb 2, as do Armv8-M.mainline from M profile. The Armv8-M.baseline CPUs do not (these are the smallest microcontrollers).
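The availability rules in this comment can be written down as a lookup, which may help when picking a triple for a test. This is a rough sketch under stated assumptions: `has_thumb2` is a hypothetical helper, the architecture strings are illustrative spellings rather than the exact names accepted by -march or by LLVM triples, and Armv6T2 is taken to be the single Armv6 variant with Thumb 2.

```python
# Illustrative only: encodes the Thumb 2 availability rules from the comment.
def has_thumb2(arch: str) -> bool:
    """Report whether an architecture name has Thumb 2 (hypothetical helper)."""
    name = arch.lower()
    if name == "armv6t2":
        return True    # the one Armv6 variant with Thumb 2
    if name.startswith("armv6"):
        return False   # every other Armv6 variant
    if name.startswith("armv7"):
        return True    # all Armv7, including M, R and A profiles
    if name == "armv8-m.baseline":
        return False   # the smallest microcontrollers
    if name.startswith("armv8"):
        return True    # Armv8 A and R profiles, and Armv8-M.mainline
    raise ValueError(f"architecture not covered by this sketch: {arch}")


if __name__ == "__main__":
    for arch in ("armv6", "armv6t2", "armv7-a", "armv8-m.baseline", "armv8-m.mainline"):
        print(arch, has_thumb2(arch))
```

Architectures older than Armv6 are deliberately left out, since the comment does not discuss them.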

nickdesaulniers added inline comments. Oct 28 2021, 10:49 AM

clang/test/CodeGen/arm-tphard.c
10 ↗ (On Diff #382558)

Thanks for all the historical context!

> In the backend this means that the compiler is free to select instructions with the Thumb2 predicate. It doesn't have to if there is an equivalent 2-byte sized Thumb instruction that does the job.

So I guess this would answer @ardb's question:

> Are you sure using this triple forces generation of Thumb2 code? It didn't seem to when I tried.

It seems that the answer is "it depends on whether there is an equivalent 2-byte size Thumb instruction that does the job."

Revision Contents

Path                                      Size
llvm/lib/Target/ARM/ARMInstrThumb.td      1 line
llvm/lib/Target/ARM/ARMInstrThumb2.td     3 lines
llvm/test/CodeGen/ARM/readtp.ll           2 lines
llvm/test/CodeGen/ARM/thread_pointer.ll   8 lines

Diff 382854

llvm/lib/Target/ARM/ARMInstrThumb.td

 //

 // __aeabi_read_tp preserves the registers r1-r3.
 // This is a pseudo inst so that we can get the encoding right,
 // complete with fixup for the aeabi_read_tp function.
 let isCall = 1, Defs = [R0, R12, LR, CPSR], Uses = [SP] in
 def tTPsoft : tPseudoInst<(outs), (ins), 4, IIC_Br,
                           [(set R0, ARMthread_pointer)]>,
+                          Requires<[IsThumb, IsReadTPSoft]>,
                           Sched<[WriteBr]>;

 //===----------------------------------------------------------------------===//
 // SJLJ Exception handling intrinsics
 //

 // eh_sjlj_setjmp() is an instruction sequence to store the return address and
 // save #0 in R0 for the non-longjmp case. Since by its nature we may be coming

llvm/lib/Target/ARM/ARMInstrThumb2.td

@@ def t2CDP2 : T2Cop<0b1111, (outs), (ins p_imm:$cop, imm0_15:$opc1,
   let Inst{19-16} = CRn;
   let Inst{23-20} = opc1;

   let Predicates = [IsThumb2, PreV8];
   let DecoderNamespace = "Thumb2CoProc";
 }

+// Reading thread pointer from coprocessor register
+def : T2Pat<(ARMthread_pointer), (t2MRC 15, 0, 13, 0, 3)>,
+      Requires<[IsThumb2, IsReadTPHard]>;

 //===----------------------------------------------------------------------===//
 // ARMv8.1 Privilege Access Never extension
 //
 // SETPAN #imm1

 def t2SETPAN : T1I<(outs), (ins imm0_1:$imm), NoItinerary, "setpan\t$imm", []>,
                T1Misc<0b0110000>, Requires<[IsThumb2, HasV8, HasV8_1a]> {
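The operands in `(t2MRC 15, 0, 13, 0, 3)` line up with the `mrc p15, #0, rN, c13, c0, #3` text that the CHECK-HARD lines match. As a rough illustration of what those five numbers select, the sketch below packs them into the MRC bit layout from the Arm architecture manual. `encode_mrc` and `thumb2_halfwords` are hypothetical helpers written for this note, not LLVM APIs, and the destination register r0 is an arbitrary choice.

```python
# Illustrative encoder for MRC; the helper names are hypothetical, not LLVM APIs.
def encode_mrc(coproc: int, opc1: int, rt: int, crn: int, crm: int, opc2: int) -> int:
    """Pack MRC operands into a 32-bit word (ARM encoding, condition AL)."""
    fields = (("coproc", coproc, 4), ("opc1", opc1, 3), ("rt", rt, 4),
              ("crn", crn, 4), ("crm", crm, 4), ("opc2", opc2, 3))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return ((0b1110 << 28)    # condition: always
            | (0b1110 << 24)  # coprocessor register transfer opcode
            | (opc1 << 21)
            | (1 << 20)       # L bit set: read from coprocessor (MRC, not MCR)
            | (crn << 16)
            | (rt << 12)
            | (coproc << 8)
            | (opc2 << 5)
            | (1 << 4)
            | crm)


def thumb2_halfwords(word: int) -> tuple:
    """Split the word into the two 16-bit halfwords of the Thumb2 encoding."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


if __name__ == "__main__":
    word = encode_mrc(coproc=15, opc1=0, rt=0, crn=13, crm=0, opc2=3)
    print(hex(word))                                   # 0xee1d0f70
    print([hex(h) for h in thumb2_halfwords(word)])    # ['0xee1d', '0xf70']
```

The 32-bit Thumb2 form of MRC uses the same field layout with the top four bits fixed to 1110, which is why the sketch can reuse the ARM word and only split it into halfwords.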

llvm/test/CodeGen/ARM/readtp.ll

 ; RUN: llc -mtriple=armeb-linux-gnueabihf -O2 -mattr=+read-tp-hard %s -o - | FileCheck %s -check-prefix=CHECK-HARD
 ; RUN: llc -mtriple=armeb-linux-gnueabihf -O2 %s -o - | FileCheck %s -check-prefix=CHECK-SOFT
+; RUN: llc -mtriple=thumbv7-linux-gnueabihf -O2 -mattr=+read-tp-hard %s -o - | FileCheck %s -check-prefix=CHECK-HARD
+; RUN: llc -mtriple=thumbv7-linux-gnueabihf -O2 %s -o - | FileCheck %s -check-prefix=CHECK-SOFT


 ; __thread int counter;
 ; void foo() {
 ; counter = 5;
 ; }

llvm/test/CodeGen/ARM/thread_pointer.ll

-; RUN: llc -mtriple arm-linux-gnueabi -filetype asm -o - %s | FileCheck %s
+; RUN: llc -mtriple arm-linux-gnueabi -o - %s | FileCheck %s -check-prefix=CHECK-SOFT
+; RUN: llc -mtriple arm-linux-gnueabi -mattr=+read-tp-hard -o - %s | FileCheck %s -check-prefix=CHECK-HARD
+; RUN: llc -mtriple thumbv7-linux-gnueabi -o - %s | FileCheck %s -check-prefix=CHECK-SOFT
+; RUN: llc -mtriple thumbv7-linux-gnueabi -mattr=+read-tp-hard -o - %s | FileCheck %s -check-prefix=CHECK-HARD

 declare i8* @llvm.thread.pointer()

 define i8* @test() {
 entry:
   %tmp1 = call i8* @llvm.thread.pointer()
   ret i8* %tmp1
 }

-; CHECK: bl __aeabi_read_tp
+; CHECK-SOFT: bl __aeabi_read_tp
+; CHECK-HARD: mrc p15, #0, {{r[0-9]+}}, c13, c0, #3