This is an archive of the discontinued LLVM Phabricator instance.

Differential D155485

Retain all jump table range checks when using BTI.
Closed, Public

Authored by simon_tatham on Jul 17 2023, 9:35 AM.

Details

Reviewers

danielkiss
MaskRay
peter.smith
phosek
DavidSpickett
jhenderson
chill

Commits

rG60b98363c7ed: Retain all jump table range checks when using BTI.

Summary

This modifies the switch-statement generation in SelectionDAGBuilder,
specifically the part that generates case clusters of type CC_JumpTable.

A table-based branch of any kind is at risk of being a JOP gadget, if
it doesn't range-check the offset into the table. For some types of
table branch, such as Arm TBB/TBH, the impact of this is limited
because the value loaded from the table is a relative offset of
limited size; for others, such as a MOV PC,Rn computed branch into a
table of further branch instructions, the gadget is fully general.
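
To put rough numbers on "limited size": TBB and TBH load an unsigned byte or halfword from the table, double it, and add it to the PC, so even a corrupted index can only redirect control a bounded distance forward. A minimal sketch of that bound follows; `maxReachBytes` is a name made up for this illustration and is not part of the patch.

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): the largest forward displacement,
// in bytes, that a TBB/TBH-style table entry can encode. The entry is an
// unsigned value of entryBits bits and is doubled before being added to PC.
constexpr std::uint32_t maxReachBytes(unsigned entryBits) {
  return 2u * ((1u << entryBits) - 1u);
}

// TBB uses byte entries, TBH uses halfword entries.
static_assert(maxReachBytes(8) == 510u, "TBB reach");
static_assert(maxReachBytes(16) == 131070u, "TBH reach");
```

A MOV PC,Rn dispatch has no such bound, which is why the summary calls that gadget fully general.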

When compiling for branch-target enforcement via Arm's BTI system,
many of these table branch idioms use branch instructions of types
that do not require a BTI instruction at the branch destination. This
avoids the need to put a BTI at the start of each case handler,
reducing the number of available gadgets with BTIs (i.e. ones
which could be used by a JOP attack in spite of the BTI system). But
without a range check, the use of a non-BTI-requiring branch also
opens up a larger range of followup gadgets for an attacker's use.

A defence against this is to avoid optimising away the range check on
the table offset, even if the compiler believes that no out-of-range
value should be able to reach the table branch. (Rationale: that may
be true for values generated legitimately by the program, but not
those generated maliciously by attackers who have already corrupted
the control flow.)

The effect of keeping the range check and branching to an unreachable
block is that no actual code is generated at that block, so it will
typically point at the end of the function. That may still cause some
kind of unpredictable code execution (such as executing data as code,
or falling through to the next function in the code section), but even
if so, there will only be one possible invalid branch target,
rather than giving an attacker the choice of many possibilities.
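
The difference can be sketched with a toy model in plain C++. Everything here is an assumption made for illustration: the eight-word `kMemory` array stands in for the jump table followed by unrelated data, and the integer "targets" stand in for code addresses. It is not how SelectionDAGBuilder represents any of this.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model: words 0..3 are the real jump table entries, words 4..7 are
// unrelated data that happens to follow the table in memory.
constexpr std::size_t kNumCases = 4;
constexpr int kTrapTarget = -1; // the single "unreachable" fallthrough block
constexpr std::array<int, 8> kMemory = {100, 101, 102, 103,
                                        900, 901, 902, 903};

// Range check optimised away: an out-of-range index selects whatever
// follows the table, so the attacker chooses among many targets.
int dispatchUnchecked(std::uint32_t index) {
  assert(index < kMemory.size() && "toy model only covers eight words");
  return kMemory[index];
}

// Range check retained: every out-of-range index collapses onto the
// one trap target.
int dispatchChecked(std::uint32_t index) {
  if (index >= kNumCases)
    return kTrapTarget;
  return kMemory[index];
}
```

In this model `dispatchUnchecked(6)` yields 902, a target picked by the out-of-range index, whereas `dispatchChecked` returns `kTrapTarget` for every index of 4 or more.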

This defence is enabled only when branch target enforcement is in use.
Without branch target enforcement, the range check is easily bypassed
anyway, by branching in to a location just after it. But with
enforcement, the attacker will have to enter the jump table dispatcher
at the initial BTI and then go through the range check. (Or, if they
don't, it's because they already have a general BTI-bypassing
gadget.)

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

simon_tatham created this revision. Jul 17 2023, 9:35 AM

Herald added a project: Restricted Project. Jul 17 2023, 9:35 AM

Herald added subscribers: hiraditya, kristof.beyls.

simon_tatham requested review of this revision. Jul 17 2023, 9:35 AM

Herald added a subscriber: llvm-commits.

I can confirm that this is Arm's preferred way of fixing this. The alternative is to always use a BTI setting indirect branch, but this requires adding BTI j in front of every valid target of the branch, which bloats code-size and leaves more targets for an attacker to indirectly jump to.

Branch Target Identification on AArch64, which doesn't have a direct equivalent of the non-BTI-setting mov pc, Rn indirect branch, is not currently affected. However, the architecture has reserved the use of RET x16 for this purpose, so the code-generation strategy may change to need this on AArch64 too in the future.

Harbormaster completed remote builds in B245894: Diff 541091. Jul 17 2023, 1:14 PM

simonwallis2 added a subscriber: simonwallis2. Jul 18 2023, 12:14 AM

I agree with the hardening side argument. I have checked FallthroughUnreachable uses in CC_BitTests and CC_Range for some CodeGen tests (mostly in X86/) and confirmed that they don't need the "branch-target-enforcement" special case.

A table-based branch of any kind is at risk of being a JOP gadget, if it doesn't range-check the offset into the table. ...

Consider adding CC_JumpTable to the paragraph. Its case label is many lines above and with the default context of git log, it's difficult to see how this change is relevant to CC_JumpTable ...

... many of these table branch idioms use branch instructions that do not set the BTI flag, so they can target instructions without BTI landing pads.

Q: what does "use branch instructions that do not set the BTI flag" mean?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
11324	`CurMF->getMMI().getModule()->getModuleFlag(` is more common to get `Module`.
llvm/test/CodeGen/Thumb2/jump-table-bti.ll
2	There is a bti-jump-table.mir, but I think `jump-table-*` as this patch adds is a better name. It groups jump table tests together and matches `ARM/jump-table-*.ll`.
2	This test file needs a file-level comment describing what it tests, perhaps: `;; When BTI is enabled, keep the range check for a jump table for hardening, even with an unreachable default`
4	Add quotes, otherwise this seems to trigger some substitutions in zsh: `sed '/^..for-non-bti-build-sed-will-delete-everything-after-this-line/q'`

This revision is now accepted and ready to land. Jul 25 2023, 11:38 PM

In D155485#4506823, @peter.smith wrote:

I can confirm that this is Arm's preferred way of fixing this. The alternative is to always use a BTI setting indirect branch, but this requires adding BTI j in front of every valid target of the branch, which bloats code-size and leaves more targets for an attacker to indirectly jump to.

I see that the AArch64 equivalent (by changing llvm.arm.space to llvm.aarch64.space) has bti j in front of every valid target of the branch, therefore bloating code size.
But how does this leave more targets for an attacker to indirectly jump to? Do you mean that these bti j can be the target of a malicious indirect jump from elsewhere?

However, isn't this as unsafe as jumping to the default basic block (in this test case, the end of the function, after the RET) for an out-of-range branch value in the current switch instruction?

In D155485#4534229, @MaskRay wrote:

I agree with the hardening side argument. I have checked FallthroughUnreachable uses in CC_BitTests and CC_Range for some CodeGen tests (mostly in X86/) and confirmed that they don't need the "branch-target-enforcement" special case.

A table-based branch of any kind is at risk of being a JOP gadget, if it doesn't range-check the offset into the table. ...

Consider adding CC_JumpTable to the paragraph. Its case label is many lines above and with the default context of git log, it's difficult to see how this change is relevant to CC_JumpTable ...

... many of these table branch idioms use branch instructions that do not set the BTI flag, so they can target instructions without BTI landing pads.

Q: what does "use branch instructions that do not set the BTI flag" mean?

AArch32 has quite a few ways to do an indirect branch as the PC is a writeable register. The way the M-profile BTI is defined is in terms of BTI setting and BTI clearing instructions, where the flag is the EPSR.B bit. Intuitively a BTI setting instruction must transfer control to a BTI clearing instruction. Some indirect branches such as MOV PC, <Src Reg> are not BTI setting so they are not required to transfer control to a BTI clearing instruction.

Quotes/paraphrases from the v8-m Arm ARM https://developer.arm.com/documentation/ddi0553/latest/

BTI clearing: Branch Target Identification clearing instruction. Any instruction that clears the EPSR.B bit to zero.
BTI setting: Branch Target Identification setting instruction. Any instruction that sets the EPSR.B bit to one.
...
EPSR.B bit:
Unless otherwise stated, when this bit is set the next executed instruction must be a BTI clearing instruction otherwise an INVSTATE UsageFault is generated.

The BTI setting instructions are:

* BLX.
* BLXNS.
* When the register holding the branch address is not the LR:
– BX.
– BXNS.
* When the address is loaded into the PC:
– LDR (register).
– LDR (literal).
* When the address is loaded into the PC and the base address register is either not the SP or the SP and write-back of the SP does not occur:
– LDR (immediate).
– LDM, LDMIA, LDMFD.
– LDMDB, LDMEA.

The BTI clearing instructions are:

* BTI.
* SG.
* PACBTI.
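
The setting/clearing rule quoted above can be modelled as a one-bit state machine. This is a simplified sketch under stated assumptions: it ignores the "unless otherwise stated" exceptions in the architecture manual, and `InsnKind`, `executesWithoutInvState` and `checkExamples` are names invented for the illustration.

```cpp
#include <cassert>
#include <vector>

// Simplified classification of an executed instruction.
enum class InsnKind { BtiSetting, BtiClearing, Other };

// Returns true if the trace runs without the modelled INVSTATE UsageFault:
// whenever EPSR.B is set, the next instruction must be BTI clearing.
bool executesWithoutInvState(const std::vector<InsnKind> &trace) {
  bool epsrB = false;
  for (InsnKind kind : trace) {
    if (epsrB && kind != InsnKind::BtiClearing)
      return false; // modelled INVSTATE UsageFault
    // Only a BTI setting instruction leaves the bit set afterwards.
    epsrB = (kind == InsnKind::BtiSetting);
  }
  return true;
}

void checkExamples() {
  // BX r0 (BTI setting) landing on a plain instruction faults.
  assert(!executesWithoutInvState({InsnKind::BtiSetting, InsnKind::Other}));
  // BX r0 landing on a BTI instruction is fine.
  assert(executesWithoutInvState(
      {InsnKind::BtiSetting, InsnKind::BtiClearing, InsnKind::Other}));
  // MOV PC, Rn is not BTI setting, so it may land on any instruction.
  assert(executesWithoutInvState({InsnKind::Other, InsnKind::Other}));
}
```

The last case is the one the patch cares about: because MOV PC, Rn is classed as `Other`, nothing in this rule constrains where it lands, so the range check is the only thing limiting its targets.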

In D155485#4534309, @MaskRay wrote:

In D155485#4506823, @peter.smith wrote:

I can confirm that this is Arm's preferred way of fixing this. The alternative is to always use a BTI setting indirect branch, but this requires adding BTI j in front of every valid target of the branch, which bloats code-size and leaves more targets for an attacker to indirectly jump to.

I see that the AArch64 equivalent (by changing llvm.arm.space to llvm.aarch64.space) has bti j in front of every valid target of the branch, therefore bloating code size.
But how does this leave more targets for an attacker to indirectly jump to? Do you mean that these bti j can be the target of a malicious indirect jump from elsewhere?

Yes. The theory is that if the attacker finds some vulnerability outside of the function, they can jump to the start of each case statement, which increases the number of gadgets in the program; whether those can be usefully exploited will depend on the program. It also has the side-effect of increasing code-size.

However, isn't this as unsafe as jumping to the default basic block (in this test case, the end of the function, after the RET) for an out-of-range branch value in the current switch instruction?

In the AArch32 case the MOV PC, <REG> instruction can land anywhere, not just at the start of a BTI instruction. On AArch64 the BR <reg> instruction is at least constrained to a BTI compatible landing pad.

Hope I've understood the question well enough to answer there.

chill added a subscriber: chill. Jul 26 2023, 2:41 AM

chill added inline comments.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
11324	We likely need to check for a function attribute, cf. https://github.com/llvm/llvm-project/blob/efa43d785ee600ef4cc14589e4777264f0613ec9/llvm/lib/Target/ARM/ARMMachineFunctionInfo.cpp#L16

Thanks for the review. New version addressing all those comments (I think):

Switched to finding the current module via CurMF as suggested.

Added @chill's suggestion of checking for a function attribute that overrides the module flag, and added some cases of that to the tests.

Added a file-header comment in the test, and zsh-proofing quotes in the sed commands.

Reworded the commit message to explicitly mention CC_JumpTable, and also to talk about branch instructions that do/don't require a BTI at their destination, instead of the more low-level terminology of whether they 'set the BTI flag'.
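
For readers without the LLVM sources to hand, the attribute-overrides-module-flag decision can be sketched in standalone C++. The names `FnAttr`, `hasBranchTargetEnforcement`, `keepRangeCheck` and `checkTestConfigurations` are hypothetical stand-ins for this illustration; the real patch queries the function attribute and the module flag through LLVM's own APIs.

```cpp
#include <cassert>

// Hypothetical model of the per-function "branch-target-enforcement"
// attribute: absent, or present with value false/true.
enum class FnAttr { Absent, Off, On };

// A function attribute, when present, overrides the module-level flag.
bool hasBranchTargetEnforcement(FnAttr fnAttr, bool moduleFlag) {
  if (fnAttr == FnAttr::Absent)
    return moduleFlag;
  return fnAttr == FnAttr::On;
}

// The range check may only be dropped when the default is unreachable
// AND branch target enforcement is off for this function.
bool keepRangeCheck(bool defaultUnreachable, FnAttr fnAttr, bool moduleFlag) {
  if (!defaultUnreachable)
    return true;
  return hasBranchTargetEnforcement(fnAttr, moduleFlag);
}

void checkTestConfigurations() {
  // The four configurations exercised by the test, all with an
  // unreachable default.
  assert(keepRangeCheck(true, FnAttr::Absent, true));   // module flag only
  assert(!keepRangeCheck(true, FnAttr::Off, true));     // attribute overrides flag
  assert(!keepRangeCheck(true, FnAttr::Absent, false)); // no BTI anywhere
  assert(keepRangeCheck(true, FnAttr::On, false));      // attribute only
}
```

These four cases correspond to the BTI and NOBTI check prefixes in the new test's RUN lines.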

Harbormaster completed remote builds in B248542: Diff 544744. Jul 27 2023, 7:57 AM

(This review is already marked as accepted, but @chill's point about function attributes required a noticeable change. @chill, I'll land this on Monday on the basis of the previous acceptance if I don't see any objections before then.)

chill accepted this revision. Jul 28 2023, 2:49 AM

LGTM.

Not sure about the sed usage, is it universally available (e.g. llvm-lit internal) or maybe the test would need REQUIRES: system-linux ?

In D155485#4541796, @chill wrote:

Not sure about the sed usage, is it universally available (e.g. llvm-lit internal) or maybe the test would need REQUIRES: system-linux ?

I wasn't sure either, so I checked before deciding to use sed for the test preprocessing. The Software section of the Getting Started guide lists sed as one of the expected tools on the compilation host. And other lit tests already exist that use sed in pipelines without any special REQUIRES: to authorise it – a quick grep finds, for example, llvm/test/CodeGen/AArch64/speculation-hardening.ll.

I guess that means if you're building and testing on Windows, it's up to you to arrange to have all those tools on Windows one way or another. (Perhaps the 'git bash' environment provides good-enough ones?)

Closed by commit rG60b98363c7ed: Retain all jump table range checks when using BTI. (authored by simon_tatham). Jul 31 2023, 2:40 AM

This revision was automatically updated to reflect the committed changes.

simon_tatham added a commit: rG60b98363c7ed: Retain all jump table range checks when using BTI.

Revision Contents

Path                                                     Size
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    28 lines
llvm/test/CodeGen/Thumb2/jump-table-bti.ll               129 lines

Diff 545556

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

@@ switch (I->Kind) {
             JumpProb += DefaultProb / 2;
             FallthroughProb -= DefaultProb / 2;
             JumpMBB->setSuccProbability(SI, DefaultProb / 2);
             JumpMBB->normalizeSuccProbs();
             break;
           }
         }
 
-        if (FallthroughUnreachable)
+        // If the default clause is unreachable, propagate that knowledge into
+        // JTH->FallthroughUnreachable which will use it to suppress the range
+        // check.
+        //
+        // However, don't do this if we're doing branch target enforcement,
+        // because a table branch _without_ a range check can be a tempting JOP
+        // gadget - out-of-bounds inputs that are impossible in correct
+        // execution become possible again if an attacker can influence the
+        // control flow. So if an attacker doesn't already have a BTI bypass
+        // available, we don't want them to be able to get one out of this
+        // table branch.
+        if (FallthroughUnreachable) {
+          Function &CurFunc = CurMF->getFunction();
+          bool HasBranchTargetEnforcement = false;
+          if (CurFunc.hasFnAttribute("branch-target-enforcement")) {
+            HasBranchTargetEnforcement =
+                CurFunc.getFnAttribute("branch-target-enforcement")
+                    .getValueAsBool();
+          } else {
+            HasBranchTargetEnforcement =
+                CurMF->getMMI().getModule()->getModuleFlag(
+                    "branch-target-enforcement");
+          }
+          if (!HasBranchTargetEnforcement)
             JTH->FallthroughUnreachable = true;
+        }
 
         if (!JTH->FallthroughUnreachable)
           addSuccessorWithProb(CurMBB, Fallthrough, FallthroughProb);
         addSuccessorWithProb(CurMBB, JumpMBB, JumpProb);
         CurMBB->normalizeSuccProbs();
 
         // The jump table header will be inserted in our current block, do the
         // range check, and fall through to our fallthrough block.

llvm/test/CodeGen/Thumb2/jump-table-bti.ll

This file was added.

				;; When BTI is enabled, keep the range check for a jump table for hardening,
				;; even with an unreachable default.
				;;
				;; We check with and without the branch-target-enforcement module attribute,
				;; and in each case, try overriding it with the opposite function attribute.
				;; Expect to see a range check whenever there is BTI, and not where there
				;; isn't.

				; RUN: sed s/SPACE/4/ %s | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=BTI-TBB
				; RUN: sed s/SPACE/4/ %s | sed '/test_jumptable/s/{/#0 {/' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=NOBTI-TBB
				; RUN: sed s/SPACE/4/ %s | sed '/^..for-non-bti-build-sed-will-delete-everything-after-this-line/q' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=NOBTI-TBB
				; RUN: sed s/SPACE/4/ %s | sed '/test_jumptable/s/{/#1 {/' | sed '/^..for-non-bti-build-sed-will-delete-everything-after-this-line/q' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=BTI-TBB

				; RUN: sed s/SPACE/400/ %s | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=BTI-TBH
				; RUN: sed s/SPACE/400/ %s | sed '/test_jumptable/s/{/#0 {/' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=NOBTI-TBH
				; RUN: sed s/SPACE/400/ %s | sed '/^..for-non-bti-build-sed-will-delete-everything-after-this-line/q' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=NOBTI-TBH
				; RUN: sed s/SPACE/400/ %s | sed '/test_jumptable/s/{/#1 {/' | sed '/^..for-non-bti-build-sed-will-delete-everything-after-this-line/q' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=BTI-TBH

				; RUN: sed s/SPACE/400000/ %s | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=BTI-MOV
				; RUN: sed s/SPACE/400000/ %s | sed '/test_jumptable/s/{/#0 {/' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=NOBTI-MOV
				; RUN: sed s/SPACE/400000/ %s | sed '/^..for-non-bti-build-sed-will-delete-everything-after-this-line/q' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=NOBTI-MOV
				; RUN: sed s/SPACE/400000/ %s | sed '/test_jumptable/s/{/#1 {/' | sed '/^..for-non-bti-build-sed-will-delete-everything-after-this-line/q' | llc -mtriple=thumbv8.1m.main-linux-gnu -mattr=+pacbti -o - | FileCheck %s --check-prefix=BTI-MOV

				declare i32 @llvm.arm.space(i32, i32)

				attributes #0 = { "branch-target-enforcement"="false" }
				attributes #1 = { "branch-target-enforcement"="true" }

				define ptr @test_jumptable(ptr %src, ptr %dst) {
				entry:
				%sw = load i32, ptr %src, align 4
				%src.postinc = getelementptr inbounds i32, ptr %src, i32 1
				switch i32 %sw, label %default [
				i32 0, label %sw.0
				i32 1, label %sw.1
				i32 2, label %sw.2
				i32 3, label %sw.3
				]

				sw.0:
				%store.0 = call i32 @llvm.arm.space(i32 SPACE, i32 14142)
				store i32 %store.0, ptr %dst, align 4
				br label %exit

				sw.1:
				%store.1 = call i32 @llvm.arm.space(i32 SPACE, i32 31415)
				%dst.1 = getelementptr inbounds i32, ptr %dst, i32 1
				store i32 %store.1, ptr %dst.1, align 4
				br label %exit

				sw.2:
				%store.2 = call i32 @llvm.arm.space(i32 SPACE, i32 27182)
				%dst.2 = getelementptr inbounds i32, ptr %dst, i32 2
				store i32 %store.2, ptr %dst.2, align 4
				br label %exit

				sw.3:
				%store.3 = call i32 @llvm.arm.space(i32 SPACE, i32 16180)
				%dst.3 = getelementptr inbounds i32, ptr %dst, i32 3
				store i32 %store.3, ptr %dst.3, align 4
				br label %exit

				default:
				unreachable

				exit:
				ret ptr %src.postinc
				}

				; NOBTI-TBB: test_jumptable:
				; NOBTI-TBB-NEXT: .fnstart
				; NOBTI-TBB-NEXT: @ %bb
				; NOBTI-TBB-NEXT: ldr [[INDEX:r[0-9]+]], [r0], #4
				; NOBTI-TBB-NEXT: .LCPI
				; NOBTI-TBB-NEXT: tbb [pc, [[INDEX]]]

				; BTI-TBB: test_jumptable:
				; BTI-TBB-NEXT: .fnstart
				; BTI-TBB-NEXT: @ %bb
				; BTI-TBB-NEXT: bti
				; BTI-TBB-NEXT: ldr [[INDEX:r[0-9]+]], [r0], #4
				; BTI-TBB-NEXT: cmp [[INDEX]], #3
				; BTI-TBB-NEXT: bhi .LBB
				; BTI-TBB-NEXT: @ %bb
				; BTI-TBB-NEXT: .LCPI
				; BTI-TBB-NEXT: tbb [pc, [[INDEX]]]

				; NOBTI-TBH: test_jumptable:
				; NOBTI-TBH-NEXT: .fnstart
				; NOBTI-TBH-NEXT: @ %bb
				; NOBTI-TBH-NEXT: ldr [[INDEX:r[0-9]+]], [r0], #4
				; NOBTI-TBH-NEXT: .LCPI
				; NOBTI-TBH-NEXT: tbh [pc, [[INDEX]], lsl #1]

				; BTI-TBH: test_jumptable:
				; BTI-TBH-NEXT: .fnstart
				; BTI-TBH-NEXT: @ %bb
				; BTI-TBH-NEXT: bti
				; BTI-TBH-NEXT: ldr [[INDEX:r[0-9]+]], [r0], #4
				; BTI-TBH-NEXT: cmp [[INDEX]], #3
				; BTI-TBH-NEXT: bhi.w .LBB
				; BTI-TBH-NEXT: @ %bb
				; BTI-TBH-NEXT: .LCPI
				; BTI-TBH-NEXT: tbh [pc, [[INDEX]], lsl #1]

				; NOBTI-MOV: test_jumptable:
				; NOBTI-MOV-NEXT: .fnstart
				; NOBTI-MOV-NEXT: @ %bb
				; NOBTI-MOV-NEXT: ldr [[INDEX:r[0-9]+]], [r0], #4
				; NOBTI-MOV-NEXT: adr.w [[ADDR:r[0-9]+]], .LJTI
				; NOBTI-MOV-NEXT: add.w [[ADDR]], [[ADDR]], [[INDEX]], lsl #2
				; NOBTI-MOV-NEXT: mov pc, [[ADDR]]

				; BTI-MOV: test_jumptable:
				; BTI-MOV-NEXT: .fnstart
				; BTI-MOV-NEXT: @ %bb
				; BTI-MOV-NEXT: bti
				; BTI-MOV-NEXT: ldr [[INDEX:r[0-9]+]], [r0], #4
				; BTI-MOV-NEXT: cmp [[INDEX]], #3
				; BTI-MOV-NEXT: bls .LBB
				; BTI-MOV-NEXT: b.w .LBB
				; BTI-MOV-NEXT: .LBB
				; BTI-MOV-NEXT: adr.w [[ADDR:r[0-9]+]], .LJTI
				; BTI-MOV-NEXT: add.w [[ADDR]], [[ADDR]], [[INDEX]], lsl #2
				; BTI-MOV-NEXT: mov pc, [[ADDR]]

				; for-non-bti-build-sed-will-delete-everything-after-this-line
				!llvm.module.flags = !{!0}
				!0 = !{i32 8, !"branch-target-enforcement", i32 1}