This is an archive of the discontinued LLVM Phabricator instance.

Therefore, the BLR instruction gets split into two: one BL and one BR.
With this transformation, no speculation barrier is inserted on
the architectural execution path.

The mitigation is off by default and can be enabled by the
harden-sls-blr subtarget feature.

As a linker is allowed to clobber X16 and X17 on function calls, the
above code transformation would not be correct in case a linker does so
when N=16 or N=17. Therefore, when the mitigation is enabled, generation
of BLR x16 or BLR x17 is avoided.
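
The rewrite and the x16/x17 restriction described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: it works on register numbers and assembly strings rather than on LLVM's MachineInstr, and `thunkNameFor`, `rewriteIndirectCall` and `thunkBody` are hypothetical names that do not appear in the patch. It also treats x30 as unusable, since BL overwrites the link register, and it assumes the ISB/DSB form of the barrier is spelled `dsb sy` followed by `isb`.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: name of the thunk for an indirect call through xN.
// Returns std::nullopt for registers the mitigation must not call through:
// x16/x17 (a linker veneer or PLT stub may clobber them before the thunk
// runs) and x30 (the link register, overwritten by the BL itself).
std::optional<std::string> thunkNameFor(unsigned N) {
  if (N > 29 || N == 16 || N == 17)
    return std::nullopt;
  return "__llvm_slsblr_thunk_x" + std::to_string(N);
}

// Hypothetical helper: the call-site instruction that replaces "blr xN".
std::optional<std::string> rewriteIndirectCall(unsigned N) {
  if (auto Name = thunkNameFor(N))
    return "bl " + *Name;
  return std::nullopt;
}

// Hypothetical helper: assembly text of the thunk body for xN. The barrier
// sits after the BR, so it is only reached under straight-line speculation.
std::string thunkBody(unsigned N) {
  assert(thunkNameFor(N).has_value() && "no thunk exists for this register");
  return "br x" + std::to_string(N) + "\ndsb sy\nisb\n";
}
```

For example, `rewriteIndirectCall(5)` yields `bl __llvm_slsblr_thunk_x5`, while `rewriteIndirectCall(16)` yields no value, mirroring the requirement that the compiler must pick a different register before the rewrite can apply.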

As BLRA* indirect calls are not produced by LLVM currently, this does
not aim to implement support for those.

Event Timeline

kristof.beyls created this revision. Jun 8 2020, 8:13 AM

Herald added a project: Restricted Project. Jun 8 2020, 8:13 AM

Herald added subscribers: llvm-commits, danielkiss, hiraditya.

Harbormaster failed remote builds in B59490: Diff 269237! Jun 8 2020, 8:16 AM

kristof.beyls added a child revision: D81403: [IndirectThunks] Make generated MF structure as expected by all instruction selectors. Jun 8 2020, 8:45 AM

kristof.beyls added a parent revision: D81401: [NFC] Refactor ThunkInserter to make it available for all targets.

A few nits, but given the time sensitivity of this, the only blocking one is how this will work with PLTs or long-branch veneers.

llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
153	Is it possible for the call to these thunks to go through a long-branch veneer, PLT, etc, which could clobber x16 or x17 before it gets used in the thunk? Maybe we could avoid generating indirect calls using those registers, but we have the exact opposite restriction for tail calls when using BTI (grep for TCRETURNriBTI).
193	Style nit: Variable should be declared further down, when first set.
208	Is this something which could be fixed in ThunkInserter?
232	This comment is a bit misleading, we're not actually splitting the basic block (that's terminology used elsewhere in LLVM for replacing one basic block with two).

ostannard mentioned this in D81405: [AArch64] Avoid incompatibility between SLSBLR mitigation and BTI codegen. Jun 8 2020, 10:14 AM

The updated patch addresses Oliver's review comments.
In particular, it addresses the feedback about potential clobbering of X16 or X17 when the call to a thunk goes through a long-branch veneer.
It does so by avoiding the generation of indirect calls that use X16 or X17.

kristof.beyls marked 6 inline comments as done. Jun 10 2020, 7:18 AM

kristof.beyls added inline comments.

llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
208	Maybe, but I think that should be done as a separate patch as it's probably target-independent.
232	Hopefully the new comment is better.

LGTM, with some comments which can be addressed later given the time-sensitivity of this.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
2020 ↗	(On Diff #269838)	Is the `BLRCall` pseudo actually needed, or could we just use `BLR` when not doing the mitigation?
llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
111	`BLRCall` shouldn't be selected when this pass is enabled, so maybe we should assert here?
153	Please add a comment noting that x16 and x17 are deliberately omitted.
159	X30 doesn't need to be in these lists either, it (correctly) isn't allowed with `BLRCallNoIP`.

This revision is now accepted and ready to land. Jun 10 2020, 8:17 AM

kristof.beyls updated this revision to Diff 270124. Jun 11 2020, 6:45 AM

kristof.beyls marked 2 inline comments as done.

kristof.beyls marked 6 inline comments as done. Jun 11 2020, 6:49 AM

kristof.beyls added inline comments.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
2020 ↗	(On Diff #269838)	Indeed BLRCall is not needed. Now removed in the latest version of the patch.
llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
111	I've changed the pseudo expansion to use PseudoInstExpansion. Depending on when that expansion happens in the pipeline (before or after the mitigation pass), one could see BLR or BLRNoIP here. I thought it's safest to just handle both here. There's an assert in ConvertBLRToBL that should catch what we actually care about, i.e. not using X16/X17 (nor LR).
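
The approach described in this reply (accept either call form, then rely on an assert for the registers that matter) can be sketched in isolation. This is a toy model rather than the patch's code: the `Opcode` enum, the `Call` struct, and the functions `isIndirectCall`, `isAllowedCallRegister` and `convertToThunkCall` are all hypothetical stand-ins for LLVM's opcodes and for `isBLR`/`ConvertBLRToBL`.

```cpp
#include <cassert>

// Toy stand-ins for the opcodes mentioned in the discussion; not LLVM's enum.
enum class Opcode { BLR, BLRNoIP, BL, BR, Other };

// Accept both forms, since either may be seen depending on whether the
// pseudo was expanded before or after this point in the pipeline.
bool isIndirectCall(Opcode Op) {
  return Op == Opcode::BLR || Op == Opcode::BLRNoIP;
}

// Registers a hardened call may go through: anything except x16, x17 and
// x30 (LR).
bool isAllowedCallRegister(unsigned N) {
  return N <= 29 && N != 16 && N != 17;
}

struct Call {
  Opcode Op;
  unsigned Reg; // N in xN
};

// Hypothetical conversion step: turns an indirect call through xN into a
// direct call to the thunk for xN. The asserts catch the cases that the
// classification above deliberately does not filter out.
Call convertToThunkCall(const Call &C) {
  assert(isIndirectCall(C.Op) && "expected BLR or BLRNoIP");
  assert(isAllowedCallRegister(C.Reg) && "x16, x17 and LR must not be used");
  return Call{Opcode::BL, C.Reg};
}
```

Keeping the classification permissive and the register check strict means a mis-ordered expansion does not go unnoticed in an asserts-enabled build.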

kristof.beyls marked 3 inline comments as done. Jun 11 2020, 11:41 PM

kristof.beyls added inline comments.

llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
208	I've now posted a fix for this in the ThunkInserter in D81403

Closed by commit rGc35ed40f4f1b: [AArch64] Extend AArch64SLSHardeningPass to harden BLR instructions. (authored by kristof.beyls). Jun 11 2020, 11:57 PM

This revision was automatically updated to reflect the committed changes.

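Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64.h	1 line
llvm/lib/Target/AArch64/AArch64.td	3 lines
llvm/lib/Target/AArch64/AArch64SLSHardening.cpp	335 lines
llvm/lib/Target/AArch64/AArch64Subtarget.h	2 lines
llvm/lib/Target/AArch64/AArch64TargetMachine.cpp	1 line
llvm/test/CodeGen/AArch64/O0-pipeline.ll	1 line
llvm/test/CodeGen/AArch64/O3-pipeline.ll	1 line
llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir	58 lines
llvm/test/CodeGen/AArch64/speculation-hardening-sls.ll	69 lines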

Diff 269237

llvm/lib/Target/AArch64/AArch64.h

	FunctionPass *createAArch64CompressJumpTablesPass();			FunctionPass *createAArch64CompressJumpTablesPass();
	FunctionPass *createAArch64ConditionalCompares();			FunctionPass *createAArch64ConditionalCompares();
	FunctionPass *createAArch64AdvSIMDScalar();			FunctionPass *createAArch64AdvSIMDScalar();
	FunctionPass *createAArch64ISelDag(AArch64TargetMachine &TM,			FunctionPass *createAArch64ISelDag(AArch64TargetMachine &TM,
	CodeGenOpt::Level OptLevel);			CodeGenOpt::Level OptLevel);
	FunctionPass *createAArch64StorePairSuppressPass();			FunctionPass *createAArch64StorePairSuppressPass();
	FunctionPass *createAArch64ExpandPseudoPass();			FunctionPass *createAArch64ExpandPseudoPass();
	FunctionPass *createAArch64SLSHardeningPass();			FunctionPass *createAArch64SLSHardeningPass();
				FunctionPass *createAArch64IndirectThunks();
	FunctionPass *createAArch64SpeculationHardeningPass();			FunctionPass *createAArch64SpeculationHardeningPass();
	FunctionPass *createAArch64LoadStoreOptimizationPass();			FunctionPass *createAArch64LoadStoreOptimizationPass();
	FunctionPass *createAArch64SIMDInstrOptPass();			FunctionPass *createAArch64SIMDInstrOptPass();
	ModulePass *createAArch64PromoteConstantPass();			ModulePass *createAArch64PromoteConstantPass();
	FunctionPass *createAArch64ConditionOptimizerPass();			FunctionPass *createAArch64ConditionOptimizerPass();
	FunctionPass *createAArch64A57FPLoadBalancing();			FunctionPass *createAArch64A57FPLoadBalancing();
	FunctionPass *createAArch64A53Fix835769();			FunctionPass *createAArch64A53Fix835769();
	FunctionPass *createFalkorHWPFFixPass();			FunctionPass *createFalkorHWPFFixPass();

llvm/lib/Target/AArch64/AArch64.td


	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Control codegen mitigation against Straight Line Speculation vulnerability.			// Control codegen mitigation against Straight Line Speculation vulnerability.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def FeatureHardenSlsRetBr : SubtargetFeature<"harden-sls-retbr",			def FeatureHardenSlsRetBr : SubtargetFeature<"harden-sls-retbr",
	"HardenSlsRetBr", "true",			"HardenSlsRetBr", "true",
	"Harden against straight line speculation across RET and BR instructions">;			"Harden against straight line speculation across RET and BR instructions">;
				def FeatureHardenSlsBlr : SubtargetFeature<"harden-sls-blr",
				"HardenSlsBlr", "true",
				"Harden against straight line speculation across BLR instructions">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// AArch64 Processors supported.			// AArch64 Processors supported.
	//			//

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Unsupported features to disable for scheduling models			// Unsupported features to disable for scheduling models
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

llvm/lib/Target/AArch64/AArch64SLSHardening.cpp

//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "AArch64InstrInfo.h"		#include "AArch64InstrInfo.h"
#include "AArch64Subtarget.h"		#include "AArch64Subtarget.h"
#include "Utils/AArch64BaseInfo.h"		#include "Utils/AArch64BaseInfo.h"
#include "llvm/ADT/BitVector.h"		#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
		#include "llvm/CodeGen/IndirectThunks.h"
#include "llvm/CodeGen/MachineBasicBlock.h"		#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineInstr.h"		#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"		#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineOperand.h"		#include "llvm/CodeGen/MachineOperand.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/RegisterScavenging.h"		#include "llvm/CodeGen/RegisterScavenging.h"
		public:
}		}

bool runOnMachineFunction(MachineFunction &Fn) override;		bool runOnMachineFunction(MachineFunction &Fn) override;

StringRef getPassName() const override { return AARCH64_SLS_HARDENING_NAME; }		StringRef getPassName() const override { return AARCH64_SLS_HARDENING_NAME; }

private:		private:
bool hardenReturnsAndBRs(MachineBasicBlock &MBB) const;		bool hardenReturnsAndBRs(MachineBasicBlock &MBB) const;
void insertSpeculationBarrier(MachineBasicBlock &MBB,		bool hardenBLRs(MachineBasicBlock &MBB) const;
MachineBasicBlock::iterator MBBI,		MachineBasicBlock &ConvertBLRToBL(MachineBasicBlock &MBB,
DebugLoc DL) const;		MachineBasicBlock::iterator) const;
};		};

} // end anonymous namespace		} // end anonymous namespace

char AArch64SLSHardening::ID = 0;		char AArch64SLSHardening::ID = 0;

INITIALIZE_PASS(AArch64SLSHardening, "aarch64-sls-hardening",		INITIALIZE_PASS(AArch64SLSHardening, "aarch64-sls-hardening",
AARCH64_SLS_HARDENING_NAME, false, false)		AARCH64_SLS_HARDENING_NAME, false, false)

void AArch64SLSHardening::insertSpeculationBarrier(		static void insertSpeculationBarrier(const AArch64Subtarget *ST,
MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,		MachineBasicBlock &MBB,
DebugLoc DL) const {		MachineBasicBlock::iterator MBBI,
		DebugLoc DL,
		bool AlwaysUseISBDSB = false) {
assert(MBBI != MBB.begin() &&		assert(MBBI != MBB.begin() &&
"Must not insert SpeculationBarrierEndBB as only instruction in MBB.");		"Must not insert SpeculationBarrierEndBB as only instruction in MBB.");
assert(std::prev(MBBI)->isBarrier() &&		assert(std::prev(MBBI)->isBarrier() &&
"SpeculationBarrierEndBB must only follow unconditional control flow "		"SpeculationBarrierEndBB must only follow unconditional control flow "
"instructions.");		"instructions.");
assert(std::prev(MBBI)->isTerminator() &&		assert(std::prev(MBBI)->isTerminator() &&
"SpeculatoinBarrierEndBB must only follow terminators.");		"SpeculationBarrierEndBB must only follow terminators.");
if (ST->hasSB())		const TargetInstrInfo *TII = ST->getInstrInfo();
BuildMI(MBB, MBBI, DL, TII->get(AArch64::SpeculationBarrierSBEndBB));		unsigned BarrierOpc = ST->hasSB() && !AlwaysUseISBDSB
else		? AArch64::SpeculationBarrierSBEndBB
BuildMI(MBB, MBBI, DL, TII->get(AArch64::SpeculationBarrierISBDSBEndBB));		: AArch64::SpeculationBarrierISBDSBEndBB;
		if (MBBI == MBB.end() ||
		(MBBI->getOpcode() != AArch64::SpeculationBarrierSBEndBB &&
		MBBI->getOpcode() != AArch64::SpeculationBarrierISBDSBEndBB))
		BuildMI(MBB, MBBI, DL, TII->get(BarrierOpc));
}		}

bool AArch64SLSHardening::runOnMachineFunction(MachineFunction &MF) {		bool AArch64SLSHardening::runOnMachineFunction(MachineFunction &MF) {
ST = &MF.getSubtarget<AArch64Subtarget>();		ST = &MF.getSubtarget<AArch64Subtarget>();
TII = MF.getSubtarget().getInstrInfo();		TII = MF.getSubtarget().getInstrInfo();
TRI = MF.getSubtarget().getRegisterInfo();		TRI = MF.getSubtarget().getRegisterInfo();

bool Modified = false;		bool Modified = false;
for (auto &MBB : MF)		for (auto &MBB : MF) {
Modified |= hardenReturnsAndBRs(MBB);		Modified |= hardenReturnsAndBRs(MBB);
		Modified |= hardenBLRs(MBB);
		}

return Modified;		return Modified;
}		}

		static bool isBLR(const MachineInstr &MI) {
		switch (MI.getOpcode()) {
		case AArch64::BLR:
		return true;
		case AArch64::BLRAA:
		case AArch64::BLRAB:
		case AArch64::BLRAAZ:
		case AArch64::BLRABZ:
		llvm_unreachable("Currently, LLVM's code generator does not support "
		"producing BLRA* instructions. Therefore, there's no "
		"support in this pass for those instructions.");
		}
		return false;
		}

bool AArch64SLSHardening::hardenReturnsAndBRs(MachineBasicBlock &MBB) const {		bool AArch64SLSHardening::hardenReturnsAndBRs(MachineBasicBlock &MBB) const {
if (!ST->hardenSlsRetBr())		if (!ST->hardenSlsRetBr())
return false;		return false;
bool Modified = false;		bool Modified = false;
MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end();		MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end();
MachineBasicBlock::iterator NextMBBI;		MachineBasicBlock::iterator NextMBBI;
for (; MBBI != E; MBBI = NextMBBI) {		for (; MBBI != E; MBBI = NextMBBI) {
MachineInstr &MI = *MBBI;		MachineInstr &MI = *MBBI;
NextMBBI = std::next(MBBI);		NextMBBI = std::next(MBBI);
if (MI.isReturn() || isIndirectBranchOpcode(MI.getOpcode())) {		if (MI.isReturn() || isIndirectBranchOpcode(MI.getOpcode())) {
assert(MI.isTerminator());		assert(MI.isTerminator());
insertSpeculationBarrier(MBB, std::next(MBBI), MI.getDebugLoc());		insertSpeculationBarrier(ST, MBB, std::next(MBBI), MI.getDebugLoc());
		Modified = true;
		}
		}
		return Modified;
		}

		static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";

		static std::array<const char *, 32> SLSBLRThunkNames{
		"__llvm_slsblr_thunk_x0", "__llvm_slsblr_thunk_x1",
		"__llvm_slsblr_thunk_x2", "__llvm_slsblr_thunk_x3",
		"__llvm_slsblr_thunk_x4", "__llvm_slsblr_thunk_x5",
		"__llvm_slsblr_thunk_x6", "__llvm_slsblr_thunk_x7",
		"__llvm_slsblr_thunk_x8", "__llvm_slsblr_thunk_x9",
		"__llvm_slsblr_thunk_x10", "__llvm_slsblr_thunk_x11",
		"__llvm_slsblr_thunk_x12", "__llvm_slsblr_thunk_x13",
		"__llvm_slsblr_thunk_x14", "__llvm_slsblr_thunk_x15",
		"__llvm_slsblr_thunk_x16", "__llvm_slsblr_thunk_x17",
		"__llvm_slsblr_thunk_x18", "__llvm_slsblr_thunk_x19",
		"__llvm_slsblr_thunk_x20", "__llvm_slsblr_thunk_x21",
		"__llvm_slsblr_thunk_x22", "__llvm_slsblr_thunk_x23",
		"__llvm_slsblr_thunk_x24", "__llvm_slsblr_thunk_x25",
		"__llvm_slsblr_thunk_x26", "__llvm_slsblr_thunk_x27",
		"__llvm_slsblr_thunk_x28", "__llvm_slsblr_thunk_x29",
		"__llvm_slsblr_thunk_x30", "__llvm_slsblr_thunk_x31",
		};
		static std::array<unsigned, 32> SLSBLRThunkRegs{
		AArch64::X0, AArch64::X1, AArch64::X2, AArch64::X3, AArch64::X4,
		AArch64::X5, AArch64::X6, AArch64::X7, AArch64::X8, AArch64::X9,
		AArch64::X10, AArch64::X11, AArch64::X12, AArch64::X13, AArch64::X14,
		AArch64::X15, AArch64::X16, AArch64::X17, AArch64::X18, AArch64::X19,
		AArch64::X20, AArch64::X21, AArch64::X22, AArch64::X23, AArch64::X24,
		AArch64::X25, AArch64::X26, AArch64::X27, AArch64::X28, AArch64::FP,
		AArch64::LR, AArch64::XZR};

		namespace {
		struct SLSBLRThunkInserter : ThunkInserter<SLSBLRThunkInserter> {
		const char *getThunkPrefix() { return SLSBLRNamePrefix; }
		bool mayUseThunk(const MachineFunction &MF) {
		// FIXME: This could also check if there are any BLRs in the function
		// to more accurately reflect if a thunk will be needed.
		return MF.getSubtarget<AArch64Subtarget>().hardenSlsBlr();
		}
		void insertThunks(MachineModuleInfo &MMI);
		void populateThunk(MachineFunction &MF);
		};
		} // namespace

		void SLSBLRThunkInserter::insertThunks(MachineModuleInfo &MMI) {
		// FIXME: It probably would be possible to filter which thunks to produce
		// based on which registers are actually used in BLR instructions in this
		// function. But would that be a worthwhile optimization?
		for (StringRef Name : SLSBLRThunkNames)
		createThunkFunction(MMI, Name);
		}

		void SLSBLRThunkInserter::populateThunk(MachineFunction &MF) {
		Register ThunkReg;
		// FIXME: How to better communicate Register number, rather than through
		// name and lookup table?
		assert(MF.getName().startswith(getThunkPrefix()));
		int Index = -1;
		for (int i = 0; i < (int)SLSBLRThunkNames.size(); ++i)
		if (MF.getName() == SLSBLRThunkNames[i]) {
		Index = i;
		break;
		}
		assert(Index != -1);
		ThunkReg = SLSBLRThunkRegs[Index];

		const TargetInstrInfo *TII =
		MF.getSubtarget<AArch64Subtarget>().getInstrInfo();
		// Grab the entry MBB and erase any other blocks. O0 codegen appears to
		// generate two bbs for the entry block.
		MachineBasicBlock *Entry = &MF.front();
		Entry->clear();
		while (MF.size() > 1)
		MF.erase(std::next(MF.begin()));

		// These thunks need to consist of the following instructions:
		// __llvm_slsblr_thunk_xN:
		// BR xN
		// barrierInsts
		Entry->addLiveIn(ThunkReg);
		BuildMI(Entry, DebugLoc(), TII->get(AArch64::BR)).addReg(ThunkReg);
		// Make sure the thunks do not make use of the SB extension in case there is
		// a function somewhere that will call to it that for some reason disabled
		// the SB extension locally on that function, even though it's enabled for
		// the module otherwise. Therefore set AlwaysUseISBDSB to true.
		insertSpeculationBarrier(&MF.getSubtarget<AArch64Subtarget>(), *Entry,
		Entry->end(), DebugLoc(), true /*AlwaysUseISBDSB*/);
		}

		MachineBasicBlock &
		AArch64SLSHardening::ConvertBLRToBL(MachineBasicBlock &MBB,
		MachineBasicBlock::iterator MBBI) const {
		// Split the current basic block as follows:
		// Before:
		// |-----------------------------|
		// |  ...                        |
		// |  instI                      |
		// |  BLR xN                     |
		// |  instJ                      |
		// |  ...                        |
		// |-----------------------------|
		//
		// After:
		// |-----------------------------|
		// |  ...                        |
		// |  instI                      |
		// |  BL __llvm_slsblr_thunk_xN  |
		// |  instJ                      |
		// |  ...                        |
		// |-----------------------------|
		//
		// __llvm_slsblr_thunk_xN:
		// |-----------------------------|
		// |  BR xN                      |
		// |  barrierInsts               |
		// |-----------------------------|
		//
		// The __llvm_slsblr_thunk_xN thunks are created by the SLSBLRThunkInserter.
		// This function merely needs to transform BLR xN into BL
		// __llvm_slsblr_thunk_xN.

		MachineInstr &BLR = *MBBI;
		assert(isBLR(BLR));
		unsigned BLOpcode;
		Register Reg;
		bool RegIsKilled;
		switch (BLR.getOpcode()) {
		case AArch64::BLR:
		BLOpcode = AArch64::BL;
		Reg = BLR.getOperand(0).getReg();
		RegIsKilled = BLR.getOperand(0).isKill();
		break;
		case AArch64::BLRAA:
		case AArch64::BLRAB:
		case AArch64::BLRAAZ:
		case AArch64::BLRABZ:
		llvm_unreachable("BLRA instructions cannot yet be produced by LLVM, "
		"therefore there is no need to support them for now.");
		default:
		llvm_unreachable("unhandled BLR");
		}
		DebugLoc DL = BLR.getDebugLoc();

		// If we'd like to support also BLRAA and BLRAB instructions, we'd need
		// a lot more different kind of thunks.
		// For example, a
		//
		// BLRAA xN, xM
		//
		// instruction probably would need to be transformed to something like:
		//
		// BL __llvm_slsblraa_thunk_x<N>_x<M>
		//
		// __llvm_slsblraa_thunk_x<N>_x<M>:
		// BRAA x<N>, x<M>
		// barrierInsts
		//
		// Given that about 30 different values of N are possible and about 30
		// different values of M are possible in the above, with the current way
		// of producing indirect thunks, we'd be producing about 30 times 30, i.e.
		// about 900 thunks (where most might not be actually called). This would
		// multiply further by two to support both BLRAA and BLRAB variants of those
		// instructions.
		// If we'd want to support this, we'd probably need to look into a different
		// way to produce thunk functions, based on which variants are actually
		// needed, rather than producing all possible variants.
		// So far, LLVM does never produce BLRA* instructions, so let's leave this
		// for the future when LLVM can start producing BLRA* instructions.
		MachineFunction &MF = *MBBI->getMF();
		MCContext &Context = MBB.getParent()->getContext();
		MCSymbol *Sym = Context.getOrCreateSymbol("__llvm_slsblr_thunk_x" +
		utostr(Reg - AArch64::X0));

		MachineInstr *BL = BuildMI(MBB, MBBI, DL, TII->get(BLOpcode)).addSym(Sym);

		// Now copy the implicit operands from BLR to BL and copy other necessary
		// info.
		// However, both BLR and BL instructions implicitly use SP and implicitly
		// define LR. Blindly copying implicit operands would result in SP and LR
		// operands to be present multiple times. While this may not be too much of
		// an issue, let's avoid that for cleanliness, by removing those implicit
		// operands from the BL created above before we copy over all implicit
		// operands from the BLR.
		int ImpLROpIdx = -1;
		int ImpSPOpIdx = -1;
		for (unsigned OpIdx = BL->getNumExplicitOperands();
		OpIdx < BL->getNumOperands(); OpIdx++) {
		MachineOperand Op = BL->getOperand(OpIdx);
		if (!Op.isReg())
		continue;
		if (Op.getReg() == AArch64::LR && Op.isDef())
		ImpLROpIdx = OpIdx;
		if (Op.getReg() == AArch64::SP && !Op.isDef())
		ImpSPOpIdx = OpIdx;
		}
		assert(ImpLROpIdx != -1);
		assert(ImpSPOpIdx != -1);
		int FirstOpIdxToRemove = std::max(ImpLROpIdx, ImpSPOpIdx);
		int SecondOpIdxToRemove = std::min(ImpLROpIdx, ImpSPOpIdx);
		BL->RemoveOperand(FirstOpIdxToRemove);
		BL->RemoveOperand(SecondOpIdxToRemove);
		// Now copy over the implicit operands from the original BLR
		BL->copyImplicitOps(MF, BLR);
		MF.moveCallSiteInfo(&BLR, BL);
		// Also add the register called in the BLR as being used in the called thunk.
		BL->addOperand(MachineOperand::CreateReg(Reg, false /*isDef*/, true /*isImp*/,
		RegIsKilled /*isKill*/));
		// Remove BLR instruction
		MBB.erase(MBBI);

		return MBB;
		}

		bool AArch64SLSHardening::hardenBLRs(MachineBasicBlock &MBB) const {
		if (!ST->hardenSlsBlr())
		return false;
		bool Modified = false;
		MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end();
		MachineBasicBlock::iterator NextMBBI;
		for (; MBBI != E; MBBI = NextMBBI) {
		MachineInstr &MI = *MBBI;
		NextMBBI = std::next(MBBI);
		if (isBLR(MI)) {
		ConvertBLRToBL(MBB, MBBI);
Modified = true;		Modified = true;
}		}
}		}
return Modified;		return Modified;
}		}

FunctionPass *llvm::createAArch64SLSHardeningPass() {		FunctionPass *llvm::createAArch64SLSHardeningPass() {
return new AArch64SLSHardening();		return new AArch64SLSHardening();
}		}

		namespace {
		class AArch64IndirectThunks : public MachineFunctionPass {
		public:
		static char ID;

		AArch64IndirectThunks() : MachineFunctionPass(ID) {}

		StringRef getPassName() const override { return "AArch64 Indirect Thunks"; }

		bool doInitialization(Module &M) override;
		bool runOnMachineFunction(MachineFunction &MF) override;

		void getAnalysisUsage(AnalysisUsage &AU) const override {
		MachineFunctionPass::getAnalysisUsage(AU);
		AU.addRequired<MachineModuleInfoWrapperPass>();
		AU.addPreserved<MachineModuleInfoWrapperPass>();
		}

		private:
		std::tuple<SLSBLRThunkInserter> TIs;

		// FIXME: When LLVM moves to C++17, these can become folds
		template <typename... ThunkInserterT>
		static void initTIs(Module &M,
		std::tuple<ThunkInserterT...> &ThunkInserters) {
		(void)std::initializer_list<int>{
		(std::get<ThunkInserterT>(ThunkInserters).init(M), 0)...};
		}
		template <typename... ThunkInserterT>
		static bool runTIs(MachineModuleInfo &MMI, MachineFunction &MF,
		std::tuple<ThunkInserterT...> &ThunkInserters) {
		bool Modified = false;
		(void)std::initializer_list<int>{
		Modified |= std::get<ThunkInserterT>(ThunkInserters).run(MMI, MF)...};
		return Modified;
		}
		};

		} // end anonymous namespace

		char AArch64IndirectThunks::ID = 0;

		FunctionPass *llvm::createAArch64IndirectThunks() {
		return new AArch64IndirectThunks();
		}

		bool AArch64IndirectThunks::doInitialization(Module &M) {
		initTIs(M, TIs);
		return false;
		}

		bool AArch64IndirectThunks::runOnMachineFunction(MachineFunction &MF) {
		LLVM_DEBUG(dbgs() << getPassName() << '\n');
		auto &MMI = getAnalysis<MachineModuleInfoWrapperPass>().getMMI();
		return runTIs(MMI, MF, TIs);
		}
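
The FIXME in AArch64IndirectThunks above notes that the `std::initializer_list` pack expansions could become folds once C++17 is available. The sketch below puts the two idioms side by side in a standalone form. `InserterA`, `InserterB`, `runAllInitList` and `runAllFold` are hypothetical names for illustration only, and `run(int)` merely stands in for the real `run(MMI, MF)` interface.

```cpp
#include <initializer_list>
#include <tuple>

// Hypothetical inserter types, standing in for ThunkInserter subclasses.
struct InserterA {
  bool run(int X) { return X > 0; }
};
struct InserterB {
  bool run(int X) { return X > 10; }
};

// Pre-C++17 idiom: expand the pack inside a braced initializer list, which
// guarantees left-to-right evaluation of the individual calls.
template <typename... Ts>
bool runAllInitList(std::tuple<Ts...> &Inserters, int X) {
  bool Modified = false;
  (void)std::initializer_list<int>{
      (Modified |= std::get<Ts>(Inserters).run(X), 0)...};
  return Modified;
}

// C++17 equivalent: a fold expression over the comma operator.
template <typename... Ts>
bool runAllFold(std::tuple<Ts...> &Inserters, int X) {
  bool Modified = false;
  ((Modified |= std::get<Ts>(Inserters).run(X)), ...);
  return Modified;
}
```

Both functions call every inserter exactly once and OR the results together; the fold version simply drops the dummy `int` list.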

llvm/lib/Target/AArch64/AArch64Subtarget.h

		protected:
bool DisableLatencySchedHeuristic = false;		bool DisableLatencySchedHeuristic = false;
bool UseRSqrt = false;		bool UseRSqrt = false;
bool Force32BitJumpTables = false;		bool Force32BitJumpTables = false;
bool UseEL1ForTP = false;		bool UseEL1ForTP = false;
bool UseEL2ForTP = false;		bool UseEL2ForTP = false;
bool UseEL3ForTP = false;		bool UseEL3ForTP = false;
bool AllowTaggedGlobals = false;		bool AllowTaggedGlobals = false;
bool HardenSlsRetBr = false;		bool HardenSlsRetBr = false;
		bool HardenSlsBlr = false;
uint8_t MaxInterleaveFactor = 2;		uint8_t MaxInterleaveFactor = 2;
uint8_t VectorInsertExtractBaseCost = 3;		uint8_t VectorInsertExtractBaseCost = 3;
uint16_t CacheLineSize = 0;		uint16_t CacheLineSize = 0;
uint16_t PrefetchDistance = 0;		uint16_t PrefetchDistance = 0;
uint16_t MinPrefetchStride = 1;		uint16_t MinPrefetchStride = 1;
unsigned MaxPrefetchIterationsAhead = UINT_MAX;		unsigned MaxPrefetchIterationsAhead = UINT_MAX;
unsigned PrefFunctionLogAlignment = 0;		unsigned PrefFunctionLogAlignment = 0;
unsigned PrefLoopLogAlignment = 0;		unsigned PrefLoopLogAlignment = 0;
		public:
/// Return true if the CPU supports any kind of instruction fusion.		/// Return true if the CPU supports any kind of instruction fusion.
bool hasFusion() const {		bool hasFusion() const {
return hasArithmeticBccFusion() || hasArithmeticCbzFusion() ||		return hasArithmeticBccFusion() || hasArithmeticCbzFusion() ||
hasFuseAES() || hasFuseArithmeticLogic() ||		hasFuseAES() || hasFuseArithmeticLogic() ||
hasFuseCCSelect() || hasFuseLiterals();		hasFuseCCSelect() || hasFuseLiterals();
}		}

bool hardenSlsRetBr() const { return HardenSlsRetBr; }		bool hardenSlsRetBr() const { return HardenSlsRetBr; }
		bool hardenSlsBlr() const { return HardenSlsBlr; }

bool useEL1ForTP() const { return UseEL1ForTP; }		bool useEL1ForTP() const { return UseEL1ForTP; }
bool useEL2ForTP() const { return UseEL2ForTP; }		bool useEL2ForTP() const { return UseEL2ForTP; }
bool useEL3ForTP() const { return UseEL3ForTP; }		bool useEL3ForTP() const { return UseEL3ForTP; }

bool useRSqrt() const { return UseRSqrt; }		bool useRSqrt() const { return UseRSqrt; }
bool force32BitJumpTables() const { return Force32BitJumpTables; }		bool force32BitJumpTables() const { return Force32BitJumpTables; }
unsigned getMaxInterleaveFactor() const { return MaxInterleaveFactor; }		unsigned getMaxInterleaveFactor() const { return MaxInterleaveFactor; }

llvm/lib/Target/AArch64/AArch64TargetMachine.cpp

void AArch64PassConfig::addPreSched2() {

// The AArch64SpeculationHardeningPass destroys dominator tree and natural		// The AArch64SpeculationHardeningPass destroys dominator tree and natural
// loop info, which is needed for the FalkorHWPFFixPass and also later on.		// loop info, which is needed for the FalkorHWPFFixPass and also later on.
// Therefore, run the AArch64SpeculationHardeningPass before the		// Therefore, run the AArch64SpeculationHardeningPass before the
// FalkorHWPFFixPass to avoid recomputing dominator tree and natural loop		// FalkorHWPFFixPass to avoid recomputing dominator tree and natural loop
// info.		// info.
addPass(createAArch64SpeculationHardeningPass());		addPass(createAArch64SpeculationHardeningPass());

		addPass(createAArch64IndirectThunks());
addPass(createAArch64SLSHardeningPass());		addPass(createAArch64SLSHardeningPass());

if (TM->getOptLevel() != CodeGenOpt::None) {		if (TM->getOptLevel() != CodeGenOpt::None) {
if (EnableFalkorHWPFFix)		if (EnableFalkorHWPFFix)
addPass(createFalkorHWPFFixPass());		addPass(createFalkorHWPFFixPass());
}		}
}		}


llvm/test/CodeGen/AArch64/O0-pipeline.ll

	; CHECK-NEXT: Fast Register Allocator			; CHECK-NEXT: Fast Register Allocator
	; CHECK-NEXT: Fixup Statepoint Caller Saved			; CHECK-NEXT: Fixup Statepoint Caller Saved
	; CHECK-NEXT: Lazy Machine Block Frequency Analysis			; CHECK-NEXT: Lazy Machine Block Frequency Analysis
	; CHECK-NEXT: Machine Optimization Remark Emitter			; CHECK-NEXT: Machine Optimization Remark Emitter
	; CHECK-NEXT: Prologue/Epilogue Insertion & Frame Finalization			; CHECK-NEXT: Prologue/Epilogue Insertion & Frame Finalization
	; CHECK-NEXT: Post-RA pseudo instruction expansion pass			; CHECK-NEXT: Post-RA pseudo instruction expansion pass
	; CHECK-NEXT: AArch64 pseudo instruction expansion pass			; CHECK-NEXT: AArch64 pseudo instruction expansion pass
	; CHECK-NEXT: AArch64 speculation hardening pass			; CHECK-NEXT: AArch64 speculation hardening pass
				; CHECK-NEXT: AArch64 Indirect Thunks
	; CHECK-NEXT: AArch64 sls hardening pass			; CHECK-NEXT: AArch64 sls hardening pass
	; CHECK-NEXT: Analyze Machine Code For Garbage Collection			; CHECK-NEXT: Analyze Machine Code For Garbage Collection
	; CHECK-NEXT: Insert fentry calls			; CHECK-NEXT: Insert fentry calls
	; CHECK-NEXT: Insert XRay ops			; CHECK-NEXT: Insert XRay ops
	; CHECK-NEXT: Implement the 'patchable-function' attribute			; CHECK-NEXT: Implement the 'patchable-function' attribute
	; CHECK-NEXT: AArch64 Branch Targets			; CHECK-NEXT: AArch64 Branch Targets
	; CHECK-NEXT: Branch relaxation pass			; CHECK-NEXT: Branch relaxation pass
	; CHECK-NEXT: Unpack machine instruction bundles			; CHECK-NEXT: Unpack machine instruction bundles

llvm/test/CodeGen/AArch64/O3-pipeline.ll

	; CHECK-NEXT: Control Flow Optimizer			; CHECK-NEXT: Control Flow Optimizer
	; CHECK-NEXT: Lazy Machine Block Frequency Analysis			; CHECK-NEXT: Lazy Machine Block Frequency Analysis
	; CHECK-NEXT: Tail Duplication			; CHECK-NEXT: Tail Duplication
	; CHECK-NEXT: Machine Copy Propagation Pass			; CHECK-NEXT: Machine Copy Propagation Pass
	; CHECK-NEXT: Post-RA pseudo instruction expansion pass			; CHECK-NEXT: Post-RA pseudo instruction expansion pass
	; CHECK-NEXT: AArch64 pseudo instruction expansion pass			; CHECK-NEXT: AArch64 pseudo instruction expansion pass
	; CHECK-NEXT: AArch64 load / store optimization pass			; CHECK-NEXT: AArch64 load / store optimization pass
	; CHECK-NEXT: AArch64 speculation hardening pass			; CHECK-NEXT: AArch64 speculation hardening pass
				; CHECK-NEXT: AArch64 Indirect Thunks
	; CHECK-NEXT: AArch64 sls hardening pass			; CHECK-NEXT: AArch64 sls hardening pass
	; CHECK-NEXT: MachineDominator Tree Construction			; CHECK-NEXT: MachineDominator Tree Construction
	; CHECK-NEXT: Machine Natural Loop Construction			; CHECK-NEXT: Machine Natural Loop Construction
	; CHECK-NEXT: Falkor HW Prefetch Fix Late Phase			; CHECK-NEXT: Falkor HW Prefetch Fix Late Phase
	; CHECK-NEXT: PostRA Machine Instruction Scheduler			; CHECK-NEXT: PostRA Machine Instruction Scheduler
	; CHECK-NEXT: Analyze Machine Code For Garbage Collection			; CHECK-NEXT: Analyze Machine Code For Garbage Collection
	; CHECK-NEXT: Machine Block Frequency Analysis			; CHECK-NEXT: Machine Block Frequency Analysis
	; CHECK-NEXT: MachinePostDominator Tree Construction			; CHECK-NEXT: MachinePostDominator Tree Construction

llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir

This file was added.

				# RUN: llc -verify-machineinstrs -mtriple=aarch64-none-linux-gnu \
				# RUN: -start-before aarch64-sls-hardening \
				# RUN: -stop-after aarch64-sls-hardening -o - %s \
				# RUN: | FileCheck %s --check-prefixes=CHECK --dump-input-on-failure

				# Check that the BLR SLS hardening transforms a BLR into a BL with operands as
				# expected.
				--- |
				$__llvm_slsblr_thunk_x8 = comdat any
				@a = dso_local local_unnamed_addr global i32 (...)* null, align 8
				@b = dso_local local_unnamed_addr global i32 0, align 4

				define dso_local void @fn1() local_unnamed_addr "target-features"="+harden-sls-blr" {
				entry:
				%0 = load i32 ()*, i32 ()** bitcast (i32 (...)** @a to i32 ()**), align 8
				%call = tail call i32 %0() nounwind
				store i32 %call, i32* @b, align 4
				ret void
				}

				; Function Attrs: naked nounwind
				define linkonce_odr hidden void @__llvm_slsblr_thunk_x8() naked nounwind comdat {
				entry:
				ret void
				}
				...
				---
				name: fn1
				tracksRegLiveness: true
				body: |
				; CHECK-LABEL: name: fn1
				bb.0.entry:
				liveins: $lr

				early-clobber $sp = frame-setup STRXpre killed $lr, $sp, -16 ; :: (store 8 into %stack.0)
				frame-setup CFI_INSTRUCTION def_cfa_offset 16
				frame-setup CFI_INSTRUCTION offset $w30, -16
				renamable $x8 = ADRP target-flags(aarch64-page) @a
				renamable $x8 = LDRXui killed renamable $x8, target-flags(aarch64-pageoff, aarch64-nc) @a :: (dereferenceable load 8 from `i32 ()** bitcast (i32 (...)** @a to i32 ()**)`)
				BLR killed renamable $x8, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp, implicit-def $w0
				; CHECK: BL <mcsymbol __llvm_slsblr_thunk_x8>, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp, implicit-def $w0, implicit killed $x8
				renamable $x8 = ADRP target-flags(aarch64-page) @b
				STRWui killed renamable $w0, killed renamable $x8, target-flags(aarch64-pageoff, aarch64-nc) @b :: (store 4 into @b)
				early-clobber $sp, $lr = frame-destroy LDRXpost $sp, 16 ; :: (load 8 from %stack.0)
				RET undef $lr


				...
				---
				name: __llvm_slsblr_thunk_x8
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x8

				BR $x8
				SpeculationBarrierISBDSBEndBB
				...
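
For readers skimming the MIR test above, the rewrite it checks can be modelled in a few lines of standalone C++. This is only an illustrative sketch, not the pass itself: `slsBlrThunkName`, `canHardenBlr`, `hardenBlr` and `slsBlrThunkBody` are hypothetical names that do not exist in LLVM, and instructions are represented as plain strings rather than MachineInstrs. It assumes the thunk naming and barrier sequence seen in the tests of this revision.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not an LLVM API: name of the thunk for "BLR xN",
// following the __llvm_slsblr_thunk_xN names that the tests check for.
std::string slsBlrThunkName(unsigned Reg) {
  // x30 is left out of this sketch: BL overwrites it with the return address.
  assert(Reg <= 29 && "sketch only models x0..x29");
  return "__llvm_slsblr_thunk_x" + std::to_string(Reg);
}

// Per the patch summary, BLR x16 / BLR x17 are not generated when the
// mitigation is on, because a linker may clobber those registers on calls.
bool canHardenBlr(unsigned Reg) { return Reg != 16 && Reg != 17; }

// Models the call-site rewrite: "blr xN" becomes "bl __llvm_slsblr_thunk_xN".
// Returns false (and leaves Out untouched) for the avoided registers.
bool hardenBlr(unsigned Reg, std::string &Out) {
  if (!canHardenBlr(Reg))
    return false;
  Out = "bl " + slsBlrThunkName(Reg);
  return true;
}

// Models the thunk body: the indirect branch followed by the DSB SY + ISB
// barrier that the .ll test expects inside the thunks.
std::vector<std::string> slsBlrThunkBody(unsigned Reg) {
  return {"br x" + std::to_string(Reg), "dsb sy", "isb"};
}
```

Under these assumptions, `hardenBlr(8, Out)` yields the `bl __llvm_slsblr_thunk_x8` form that the CHECK line in the MIR test looks for, while registers 16 and 17 are rejected so that a veneer or PLT stub cannot clobber the call target before the thunk uses it.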

llvm/test/CodeGen/AArch64/speculation-hardening-sls.ll

	; RUN: llc -mattr=harden-sls-retbr -verify-machineinstrs -mtriple=aarch64-none-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK,ISBDSB --dump-input-on-failure			; RUN: llc -mattr=harden-sls-retbr,harden-sls-blr -verify-machineinstrs -mtriple=aarch64-none-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK,ISBDSB --dump-input-on-failure
	; RUN: llc -mattr=harden-sls-retbr -mattr=+sb -verify-machineinstrs -mtriple=aarch64-none-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK,SB --dump-input-on-failure			; RUN: llc -mattr=harden-sls-retbr,harden-sls-blr -mattr=+sb -verify-machineinstrs -mtriple=aarch64-none-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK,SB --dump-input-on-failure


	; Function Attrs: norecurse nounwind readnone			; Function Attrs: norecurse nounwind readnone
	define dso_local i32 @double_return(i32 %a, i32 %b) local_unnamed_addr {			define dso_local i32 @double_return(i32 %a, i32 %b) {
	entry:			entry:
	%cmp = icmp sgt i32 %a, 0			%cmp = icmp sgt i32 %a, 0
	br i1 %cmp, label %if.then, label %if.else			br i1 %cmp, label %if.then, label %if.else

	if.then: ; preds = %entry			if.then: ; preds = %entry
	%div = sdiv i32 %a, %b			%div = sdiv i32 %a, %b
	br label %return			br label %return

	if.else: ; preds = %entry			if.else: ; preds = %entry
	%div1 = sdiv i32 %b, %a			%div1 = sdiv i32 %b, %a
	br label %return			br label %return

	return: ; preds = %if.else, %if.then			return: ; preds = %if.else, %if.then
	%retval.0 = phi i32 [ %div, %if.then ], [ %div1, %if.else ]			%retval.0 = phi i32 [ %div, %if.then ], [ %div1, %if.else ]
	ret i32 %retval.0			ret i32 %retval.0
	; CHECK-LABEL: double_return:			; CHECK-LABEL: double_return:
	; CHECK: {{ret$}}			; CHECK: {{ret$}}
	; ISBDSB-NEXT: dsb sy			; ISBDSB-NEXT: dsb sy
	; ISBDSB-NEXT: isb			; ISBDSB-NEXT: isb
	; SB-NEXT: {{ sb$}}			; SB-NEXT: {{ sb$}}
	; CHECK: {{ret$}}			; CHECK: {{ret$}}
	; ISBDSB-NEXT: dsb sy			; ISBDSB-NEXT: dsb sy
	; ISBDSB-NEXT: isb			; ISBDSB-NEXT: isb
	; SB-NEXT: {{ sb$}}			; SB-NEXT: {{ sb$}}
				; CHECK-NEXT: .Lfunc_end
	}			}

	@__const.indirect_branch.ptr = private unnamed_addr constant [2 x i8*] [i8* blockaddress(@indirect_branch, %return), i8* blockaddress(@indirect_branch, %l2)], align 8			@__const.indirect_branch.ptr = private unnamed_addr constant [2 x i8*] [i8* blockaddress(@indirect_branch, %return), i8* blockaddress(@indirect_branch, %l2)], align 8

	; Function Attrs: norecurse nounwind readnone			; Function Attrs: norecurse nounwind readnone
	define dso_local i32 @indirect_branch(i32 %a, i32 %b, i32 %i) {			define dso_local i32 @indirect_branch(i32 %a, i32 %b, i32 %i) {
				; CHECK-LABEL: indirect_branch:
	entry:			entry:
	%idxprom = sext i32 %i to i64			%idxprom = sext i32 %i to i64
	%arrayidx = getelementptr inbounds [2 x i8*], [2 x i8*]* @__const.indirect_branch.ptr, i64 0, i64 %idxprom			%arrayidx = getelementptr inbounds [2 x i8*], [2 x i8*]* @__const.indirect_branch.ptr, i64 0, i64 %idxprom
	%0 = load i8*, i8** %arrayidx, align 8			%0 = load i8*, i8** %arrayidx, align 8
	indirectbr i8* %0, [label %return, label %l2]			indirectbr i8* %0, [label %return, label %l2]
				; CHECK: br x
				; ISBDSB-NEXT: dsb sy
				; ISBDSB-NEXT: isb
				; SB-NEXT: {{ sb$}}

	l2: ; preds = %entry			l2: ; preds = %entry
	br label %return			br label %return
				; CHECK: {{ret$}}
				; ISBDSB-NEXT: dsb sy
				; ISBDSB-NEXT: isb
				; SB-NEXT: {{ sb$}}

	return: ; preds = %entry, %l2			return: ; preds = %entry, %l2
	%retval.0 = phi i32 [ 1, %l2 ], [ 0, %entry ]			%retval.0 = phi i32 [ 1, %l2 ], [ 0, %entry ]
	ret i32 %retval.0			ret i32 %retval.0
	; CHECK-LABEL: indirect_branch:
	; CHECK: br x
	; ISBDSB-NEXT: dsb sy
	; ISBDSB-NEXT: isb
	; SB-NEXT: {{ sb$}}
	; CHECK: {{ret$}}			; CHECK: {{ret$}}
	; ISBDSB-NEXT: dsb sy			; ISBDSB-NEXT: dsb sy
	; ISBDSB-NEXT: isb			; ISBDSB-NEXT: isb
	; SB-NEXT: {{ sb$}}			; SB-NEXT: {{ sb$}}
				; CHECK-NEXT: .Lfunc_end
	}			}

	; Check that RETAA and RETAB instructions are also protected as expected.			; Check that RETAA and RETAB instructions are also protected as expected.
	define dso_local i32 @ret_aa(i32 returned %a) local_unnamed_addr "target-features"="+neon,+v8.3a" "sign-return-address"="all" "sign-return-address-key"="a_key" {			define dso_local i32 @ret_aa(i32 returned %a) local_unnamed_addr "target-features"="+neon,+v8.3a" "sign-return-address"="all" "sign-return-address-key"="a_key" {
	entry:			entry:
	; CHECK-LABEL: ret_aa:			; CHECK-LABEL: ret_aa:
	; CHECK: {{ retaa$}}			; CHECK: {{ retaa$}}
	; ISBDSB-NEXT: dsb sy			; ISBDSB-NEXT: dsb sy
	; ISBDSB-NEXT: isb			; ISBDSB-NEXT: isb
	; SB-NEXT: {{ sb$}}			; SB-NEXT: {{ sb$}}
				; CHECK-NEXT: .Lfunc_end
	ret i32 %a			ret i32 %a
	}			}

	define dso_local i32 @ret_ab(i32 returned %a) local_unnamed_addr "target-features"="+neon,+v8.3a" "sign-return-address"="all" "sign-return-address-key"="b_key" {			define dso_local i32 @ret_ab(i32 returned %a) local_unnamed_addr "target-features"="+neon,+v8.3a" "sign-return-address"="all" "sign-return-address-key"="b_key" {
	entry:			entry:
	; CHECK-LABEL: ret_ab:			; CHECK-LABEL: ret_ab:
	; CHECK: {{ retab$}}			; CHECK: {{ retab$}}
	; ISBDSB-NEXT: dsb sy			; ISBDSB-NEXT: dsb sy
	; ISBDSB-NEXT: isb			; ISBDSB-NEXT: isb
	; SB-NEXT: {{ sb$}}			; SB-NEXT: {{ sb$}}
				; CHECK-NEXT: .Lfunc_end
	ret i32 %a			ret i32 %a
	}			}

				define dso_local i32 @indirect_call(
				i32 (...)* nocapture %f1, i32 (...)* nocapture %f2) {
				entry:
				; CHECK-LABEL: indirect_call:
				%callee.knr.cast = bitcast i32 (...)* %f1 to i32 ()*
				%call = tail call i32 %callee.knr.cast()
				; CHECK: bl {{__llvm_slsblr_thunk_x[0-9]+$}}
				%callee.knr.cast1 = bitcast i32 (...)* %f2 to i32 ()*
				%call2 = tail call i32 %callee.knr.cast1()
				; CHECK: bl {{__llvm_slsblr_thunk_x[0-9]+$}}
				%add = add nsw i32 %call2, %call
				ret i32 %add
				; CHECK: .Lfunc_end
				}

				; verify calling through a function pointer.
				@a = dso_local local_unnamed_addr global i32 (...)* null, align 8
				@b = dso_local local_unnamed_addr global i32 0, align 4
				define dso_local void @indirect_call_global() local_unnamed_addr {
				; CHECK-LABEL: indirect_call_global:
				entry:
				%0 = load i32 ()*, i32 ()** bitcast (i32 (...)** @a to i32 ()**), align 8
				%call = tail call i32 %0() nounwind
				; CHECK: bl {{__llvm_slsblr_thunk_x[0-9]+$}}
				store i32 %call, i32* @b, align 4
				ret void
				; CHECK: .Lfunc_end
				}

				; CHECK-LABEL: __llvm_slsblr_thunk_x0:
				; CHECK: br x0
				; ISBDSB-NEXT: dsb sy
				; ISBDSB-NEXT: isb
				; SB-NEXT: dsb sy
				; SB-NEXT: isb
				; CHECK-NEXT: .Lfunc_end
				; CHECK-LABEL: __llvm_slsblr_thunk_x19:
				; CHECK: br x19
				; ISBDSB-NEXT: dsb sy
				; ISBDSB-NEXT: isb
				; SB-NEXT: dsb sy
				; SB-NEXT: isb
				; CHECK-NEXT: .Lfunc_end

[AArch64] Extend AArch64SLSHardeningPass to harden BLR instructions.
Closed, Public

Revision Contents

Diff 269237

llvm/lib/Target/AArch64/AArch64.h

llvm/lib/Target/AArch64/AArch64.td

llvm/lib/Target/AArch64/AArch64SLSHardening.cpp

llvm/lib/Target/AArch64/AArch64Subtarget.h

llvm/lib/Target/AArch64/AArch64TargetMachine.cpp

llvm/test/CodeGen/AArch64/O0-pipeline.ll

llvm/test/CodeGen/AArch64/O3-pipeline.ll

llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir

llvm/test/CodeGen/AArch64/speculation-hardening-sls.ll
