This is an archive of the discontinued LLVM Phabricator instance.

Differential D75939

[x86][seses] Introduce SESES pass for LVI
ClosedPublic

Authored by zbrid on Mar 10 2020, 10:05 AM.

Details

Reviewers

craig.topper
jyknight
george.burgess.iv

Commits

rGbf95cf4a6816: [x86][seses] Introduce SESES pass for LVI

Summary

This is an implementation of Speculative Execution Side Effect
Suppression which is intended as a last resort mitigation against Load
Value Injection, LVI, a newly disclosed speculative execution side
channel vulnerability.

One pager:
https://software.intel.com/security-software-guidance/software-guidance/load-value-injection

Deep dive:
https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection

The mitigation consists of a compiler pass that inserts an LFENCE before
each memory read instruction, memory write instruction, and the first
branch instruction in a group of terminators at the end of a basic
block. The goal is to prevent speculative execution, potentially based
on misspeculated conditions and/or containing secret data, from leaking
that data via side channels embedded in such instructions.

This is something of a last-resort mitigation: it is expected to have
extreme performance implications, and it may not be a complete mitigation
because it depends on enumerating the relevant side channels.

In addition to the full version of the mitigation, this patch
implements three flags that each turn off part of the mitigation. These
flags are disabled by default. The flags are not intended to result in a
secure variant of the mitigation. They are intended for users who would
like to experiment with improving the performance of the mitigation. I
ran benchmarks with each of these flags enabled to find out whether there
was any room for further optimization of LFENCE placement with respect
to LVI.
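
The placement rule and the three experimental flags can be modeled outside LLVM. The following is a minimal sketch, not the pass itself: `ToyInstr`, `ToyOptions`, and `insertLfences` are hypothetical names, the option fields only mirror the `x86-seses-only-first-lfence`, `x86-seses-omit-branch-lfences`, and `x86-seses-only-lfence-non-const` flags defined in the patch, and every branch is assumed to be a terminator.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; this is NOT llvm::MachineInstr.
struct ToyInstr {
  std::string text;
  bool mayLoadOrStore;
  bool isTerminator;
  bool isBranch;           // assumed to imply isTerminator in this model
  bool constantAddressing; // only consulted for branches
};

// Hypothetical options mirroring the patch's three experimental flags.
struct ToyOptions {
  bool onlyFirstLfence = false;
  bool omitBranchLfences = false;
  bool onlyLfenceNonConst = false;
};

// Returns the block's instruction text with "lfence" entries inserted before
// each non-terminator memory access and before the first terminator of a
// terminator group that contains a qualifying branch.
std::vector<std::string> insertLfences(const std::vector<ToyInstr> &block,
                                       const ToyOptions &opts = ToyOptions()) {
  const std::size_t none = block.size();
  std::size_t firstTerminator = none;
  std::set<std::size_t> fenceBefore;

  for (std::size_t i = 0; i < block.size(); ++i) {
    const ToyInstr &mi = block[i];
    if (mi.mayLoadOrStore && !mi.isTerminator) {
      fenceBefore.insert(i);
      if (opts.onlyFirstLfence)
        break; // stop scanning this block after the first fence
    }
    if (mi.isTerminator && firstTerminator == none)
      firstTerminator = i;
    if (!mi.isBranch || opts.omitBranchLfences)
      continue;
    if (opts.onlyLfenceNonConst && mi.constantAddressing)
      continue;
    // Fence the whole terminator group once, before its first member.
    fenceBefore.insert(firstTerminator);
    break;
  }

  std::vector<std::string> out;
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (fenceBefore.count(i))
      out.push_back("lfence");
    out.push_back(block[i].text);
  }
  return out;
}
```

With default options, a block made of two memory accesses followed by a conditional and an unconditional branch gets one LFENCE before each memory access and a single LFENCE before the first terminator.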

Performance Testing Results

When applying this mitigation to BoringSSL, we see the following
results. These are a summary/aggregation of the performance changes when
this mitigation is applied versus when no mitigation is applied.

Fully Mitigated vs Baseline

Geometric mean: 0.071 (Note: This can be read as the ops/s of the mitigated program was 7.1% of the ops/s of the unmitigated program.)
Minimum: 0.041
Quartile 1: 0.060
Median: 0.063
Quartile 3: 0.077
Maximum: 0.230
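
As a worked reading of these ratios, the sketch below shows how a geometric mean of per-benchmark throughput ratios can be computed and how a ratio converts into a slowdown factor. The function names are hypothetical, and the patch does not say how its aggregation was actually scripted.

```cpp
#include <cmath>
#include <vector>

// Geometric mean of throughput ratios (mitigated ops/s divided by baseline
// ops/s). Assumes a non-empty input of strictly positive ratios.
double geometricMean(const std::vector<double> &ratios) {
  double logSum = 0.0;
  for (double r : ratios)
    logSum += std::log(r);
  return std::exp(logSum / static_cast<double>(ratios.size()));
}

// Slowdown factor implied by a throughput ratio, e.g. 0.5 -> 2x slower.
double slowdownFactor(double ratio) { return 1.0 / ratio; }
```

For the reported geometric mean, `slowdownFactor(0.071)` is about 14.1, so the mitigated build completed roughly one operation for every fourteen completed by the baseline.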

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

zbrid created this revision. · Mar 10 2020, 10:05 AM

Herald added a project: Restricted Project. · Mar 10 2020, 10:05 AM

Herald added subscribers: llvm-commits, jfb, hiraditya, mgorny.

zbrid added a reviewer: craig.topper. · Mar 10 2020, 10:16 AM

zbrid added reviewers: jyknight, george.burgess.iv. · Mar 10 2020, 10:20 AM

Add link to bug to FIXME

Harbormaster failed remote builds in B48709: Diff 249431! · Mar 10 2020, 10:53 AM

Harbormaster failed remote builds in B48719: Diff 249446! · Mar 10 2020, 11:26 AM

Thanks so much for working on this!

Keeping in mind that I'm not an x86 backend expert, this approach and implementation (especially allowing people to experiment with different flags) look reasonable to me. Please wait for approval from one of the other reviewers before landing.

This revision is now accepted and ready to land. · Mar 10 2020, 5:04 PM

Haven't looked into the code, but some suggestions for the test. Insignificant parts should be deleted. For example, .note.GNU-stack is not needed. If the codegen has nothing to do with CFI, .cfi_startproc and .cfi_endproc should be omitted. I added .Lfoo$local as a dso_local related assembler/linker optimization. It should be omitted. Tricky code sequences in the assembly should be documented. A file-level comment helps readers understand what the test is about.

llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression.ll
162	.note.GNU-stack is not needed.

craig.topper added inline comments. · Mar 11 2020, 9:36 AM

llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp
37	Something like x86-seses-one-lfence-per-bb might make a clearer name?
147	Can this be ordered earlier to avoid the forward declaration?
llvm/test/CodeGen/X86/O3-pipeline.ll
184	Drop " Pass". Doesn't look like most other passes refer to themselves as Pass

What is the intention of this set of patches in relation to D75938? It was unclear to me whether you intended to commit this implementation or were just offering it as an alternative for discussion.

In D75939#1917701, @andrew.w.kaylor wrote:

What is the intention of this set of patches in relation to D75938? It was unclear to me whether you intended to commit this implementation or were just offering it as an alternative for discussion.

I offer these patches for discussion and will upstream them if folks in the LLVM community would like them to be upstreamed. If this approach and the other approach are both desired by the LLVM community, then I will work with Intel to decide whether to merge the approaches into a single framework/pass or not.

If you have thoughts on whether or not this should be upstreamed, please let me know and feel free to add to the discussion.

For code review comments:

I will address these shortly.

In D75939#1918453, @zbrid wrote:

In D75939#1917701, @andrew.w.kaylor wrote:

What is the intention of this set of patches in relation to D75938? It was unclear to me whether you intended to commit this implementation or were just offering it as an alternative for discussion.

I offer these patches for discussion and will upstream them if folks in the LLVM community would like them to be upstreamed. If this approach and the other approach are both desired by the LLVM community, then I will work with Intel to decide whether to merge the approaches into a single framework/pass or not.

If you have thoughts on whether or not this should be upstreamed, please let me know and feel free to add to the discussion.

For code review comments:

I will address these shortly.

I see it this way. We definitely need D75935 to mitigate RET instructions, and D75934 to mitigate indirect calls/jumps. The SESES patch (with D75944) and D75936+D75937 each use very different approaches to mitigate loads. So the choice is D75939+D75944 against D75936+D75937.

zbrid added child revisions: D75940: [x86][seses] Add documentation for SESES, D75941: [x86][seses] No LFENCEs in basic blocks w/o loads. · Mar 12 2020, 2:51 PM

zbrid added a child revision: D76101: DO NOT MERGE - [x86][seses] SESES ALL CHANGES.

Update tests (CHECK-NEXT, rm some extra stuff)
ClangFormat
Update pass name
Change flag to better name

Rm forward declaration

Updated based on all the comments.

EDIT: I could do some more work to remove extra stuff from the tests, but I did a bit for now. I'll update those a bit more later. I need to understand FileCheck and LLVM IR better for further changes.

Harbormaster completed remote builds in B49073: Diff 250088. · Mar 12 2020, 5:23 PM

Harbormaster completed remote builds in B49074: Diff 250089.

jdoerfert added a subscriber: jdoerfert. · Mar 18 2020, 3:45 PM

jdoerfert added inline comments.

llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp
3	Nit: one line is sufficient here
112	Style: No braces around single statements. (also elsewhere)

Update based on jdoerfert's style comments

zbrid marked an inline comment as done. · Apr 27 2020, 10:28 AM

Update all tests to be autogenerated by update_llc_test_checks.py

Harbormaster failed remote builds in B54831: Diff 260370! · Apr 27 2020, 11:17 AM

Harbormaster failed remote builds in B54838: Diff 260383! · Apr 27 2020, 11:49 AM

I don't think that this feature will be secure unless it is also used with -mlvi-cfi. Specifically, it is not sufficient to mitigate a RET simply by placing an LFENCE before it. There must also be a read from RSP's pointee just prior to that LFENCE. Also, indirect calls/jumps from memory must be decomposed into discrete load and call/jump from register operations with an interposed LFENCE. The -mlvi-cfi enables an X86 target feature that performs both of these mitigations correctly.

Also, I think that all of your lit tests for various option combinations can be combined into a single file, with different FileCheck prefixes corresponding to different mitigation configurations.

sconstab added inline comments. · Apr 27 2020, 11:55 AM

llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp
88	There should probably be a CFI check here, e.g.: if (!Subtarget.useLVIControlFlowIntegrity()) { report_fatal_error("SESES must be used with -mlvi-cfi", false); }

In D75939#2005920, @sconstab wrote:

I don't think that this feature will be secure unless it is also used with -mlvi-cfi. Specifically, it is not sufficient to mitigate a RET simply by placing an LFENCE before it. There must also be a read from RSP's pointee just prior to that LFENCE. Also, indirect calls/jumps from memory must be decomposed into discrete load and call/jump from register operations with an interposed LFENCE. The -mlvi-cfi enables an X86 target feature that performs both of these mitigations correctly.

Also, I think that all of your lit tests for various option combinations can be combined into a single file, with different FileCheck prefixes corresponding to different mitigation configurations.

Good point on the tests, I'll update them accordingly.
Also thanks for reminding me about the -mlvi-cfi flag. I'll add a change to enable that along with this pass.

[seses] Consolidate SESES tests to a single file

Harbormaster failed remote builds in B54852: Diff 260423! · Apr 27 2020, 1:28 PM

@sconstab - For this patch, I'm going to update it to mention that it doesn't mitigate indirect branches and returns, and that users should add -mlvi-cfi if they'd like that functionality. I'll follow up with a patch that adds the lvi-cfi x86 subtarget feature when seses is enabled. That patch touches a lot of new places, so it'll be easier to review separately. I'll add the report_fatal_error in the follow-up patch too.

[seses] Tell users about not mitigating against indirect branches

Also mention how to mitigate against indirect branches

Harbormaster failed remote builds in B54874: Diff 260460! · Apr 27 2020, 3:39 PM

Closed by commit rGbf95cf4a6816: [x86][seses] Introduce SESES pass for LVI (authored by zbrid). · May 11 2020, 9:40 AM

This revision was automatically updated to reflect the committed changes.

(I don't have an opinion on this patch, but gbiv said above "Please wait for approval from one of the other reviewers before landing." and I don't see approval from anyone else on phab. Maybe it was in email on the list and didn't make it here?)

sconstab mentioned this in D79910: [x86][seses] Add clang flag; Use lvi-cfi with seses. · Jun 15 2020, 8:26 AM

Revision Contents

Path: Size
llvm/lib/Target/X86/CMakeLists.txt: 1 line
llvm/lib/Target/X86/X86.h: 2 lines
llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp: 161 lines
llvm/lib/Target/X86/X86TargetMachine.cpp: 11 lines
llvm/test/CodeGen/X86/O0-pipeline.ll: 1 line
llvm/test/CodeGen/X86/O3-pipeline.ll: 1 line
llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression-omit-branch-lfences.ll: 160 lines
llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression-only-first-lfence.ll: 150 lines
llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression-only-lfence-non-const.ll: 161 lines
llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression.ll: 161 lines

Diff 249431

llvm/lib/Target/X86/CMakeLists.txt

(56 unchanged lines not shown)
 set(sources
   X86OptimizeLEAs.cpp
   X86PadShortFunction.cpp
   X86RegisterBankInfo.cpp
   X86RegisterInfo.cpp
   X86RetpolineThunks.cpp
   X86SelectionDAGInfo.cpp
   X86ShuffleDecodeConstantPool.cpp
   X86SpeculativeLoadHardening.cpp
+  X86SpeculativeExecutionSideEffectSuppression.cpp
   X86Subtarget.cpp
   X86TargetMachine.cpp
   X86TargetObjectFile.cpp
   X86TargetTransformInfo.cpp
   X86VZeroUpper.cpp
   X86WinAllocaExpander.cpp
   X86WinEHState.cpp
   X86InsertWait.cpp
(9 unchanged lines not shown)

llvm/lib/Target/X86/X86.h

(132 unchanged lines not shown)
 /// fp exceptions when strict-fp enabled.
 FunctionPass *createX86InsertX87waitPass();

 InstructionSelector *createX86InstructionSelector(const X86TargetMachine &TM,
                                                   X86Subtarget &,
                                                   X86RegisterBankInfo &);

 FunctionPass *createX86SpeculativeLoadHardeningPass();
+FunctionPass *createX86SpeculativeExecutionSideEffectSuppressionPass();

 void initializeEvexToVexInstPassPass(PassRegistry &);
 void initializeFixupBWInstPassPass(PassRegistry &);
 void initializeFixupLEAPassPass(PassRegistry &);
 void initializeFPSPass(PassRegistry &);
 void initializeWinEHStatePassPass(PassRegistry &);
 void initializeX86AvoidSFBPassPass(PassRegistry &);
 void initializeX86CallFrameOptimizationPass(PassRegistry &);
 void initializeX86CmovConverterPassPass(PassRegistry &);
 void initializeX86CondBrFoldingPassPass(PassRegistry &);
 void initializeX86DomainReassignmentPass(PassRegistry &);
 void initializeX86ExecutionDomainFixPass(PassRegistry &);
 void initializeX86ExpandPseudoPass(PassRegistry &);
 void initializeX86FlagsCopyLoweringPassPass(PassRegistry &);
 void initializeX86OptimizeLEAPassPass(PassRegistry &);
 void initializeX86SpeculativeLoadHardeningPassPass(PassRegistry &);
+void initializeX86SpeculativeExecutionSideEffectSuppressionPassPass(PassRegistry &);
[Lint: Pre-merge checks: clang-format: please reformat the code. Suggested form: "void initializeX86SpeculativeExecutionSideEffectSuppressionPassPass(" followed on the next line by "PassRegistry &);"]

 namespace X86AS {
 enum : unsigned {
   GS = 256,
   FS = 257,
   SS = 258,
   PTR32_SPTR = 270,
   PTR32_UPTR = 271,
   PTR64 = 272
 };
 } // End X86AS namespace

 } // End llvm namespace

 #endif

llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp

This file was added.

				//===-- X86SpeculativeExecutionSideEffectSuppression.cpp ------------------===//
				//===-- X86 Speculative Execution Side Effect Suppression Pass -----------===//
				//
				[Inline comment by jdoerfert (Done): Nit: one line is sufficient here]
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				/// \file
				///
				/// This file contains the X86 implementation of the speculative execution side
				/// effect suppression mitigation.
				///
				//===----------------------------------------------------------------------===//

				#include "X86.h"
				#include "X86InstrInfo.h"
				#include "X86Subtarget.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/CodeGen/MachineFunction.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/CodeGen/MachineInstrBuilder.h"
				#include "llvm/Pass.h"
				using namespace llvm;

				#define DEBUG_TYPE "x86-seses"

				STATISTIC(NumLFENCEsInserted, "Number of lfence instructions inserted");

				static cl::opt<bool> EnableSpeculativeExecutionSideEffectSuppression(
				"x86-seses-enable",
				cl::desc("Force enable speculative execution side effect suppression"),
				cl::init(false), cl::Hidden);

				static cl::opt<bool> OnlyFirstLFENCE(
				"x86-seses-only-first-lfence",
				cl::desc(
				[Inline comment by craig.topper (Done): Something like x86-seses-one-lfence-per-bb might make a clearer name?]
				"Omit all lfences other than the first to be placed in a basic block."),
				cl::init(false), cl::Hidden);

				static cl::opt<bool> OnlyLFENCENonConst(
				"x86-seses-only-lfence-non-const",
				cl::desc("Only lfence before groups of terminators where at least one "
				"branch instruction has an input to the addressing mode that is a "
				"register other than %rip."),
				cl::init(false), cl::Hidden);

				static cl::opt<bool>
				OmitBranchLFENCEs("x86-seses-omit-branch-lfences",
				cl::desc("Omit all lfences before branch instructions."),
				cl::init(false), cl::Hidden);

				static bool hasConstantAddressingMode(const MachineInstr &MI);

				namespace {

				class X86SpeculativeExecutionSideEffectSuppressionPass
				: public MachineFunctionPass {
				public:
				X86SpeculativeExecutionSideEffectSuppressionPass() : MachineFunctionPass(ID) {}

				static char ID;
				StringRef getPassName() const override {
				return "X86 Speculative Execution Side Effect Suppression Pass";
				}

				bool runOnMachineFunction(MachineFunction &MF) override;
				};
				} // namespace

				char X86SpeculativeExecutionSideEffectSuppressionPass::ID = 0;

				bool X86SpeculativeExecutionSideEffectSuppressionPass::runOnMachineFunction(
				MachineFunction &MF) {
				if (!EnableSpeculativeExecutionSideEffectSuppression)
				return false;

				LLVM_DEBUG(dbgs() << "********** " << getPassName() << " : " << MF.getName()
				<< " **********\n");
				bool Modified = false;
				const X86Subtarget &Subtarget = MF.getSubtarget<X86Subtarget>();
				const X86InstrInfo *TII = Subtarget.getInstrInfo();
				for (MachineBasicBlock &MBB : MF) {
				MachineInstr *FirstTerminator = nullptr;

				for (auto &MI : MBB) {
				// We want to put an LFENCE before any instruction that
				// may load or store. This LFENCE is intended to avoid leaking any secret
				[Inline comment by sconstab (Not Done): There should probably be a CFI check here, e.g.: if (!Subtarget.useLVIControlFlowIntegrity()) { report_fatal_error("SESES must be used with -mlvi-cfi", false); }]
				// data due to a given load or store. This results in closing the cache
				// and memory timing side channels. We will treat terminators that load
				// or store separately.
				if (MI.mayLoadOrStore() && !MI.isTerminator()) {
				BuildMI(MBB, MI, DebugLoc(), TII->get(X86::LFENCE));
				NumLFENCEsInserted++;
				Modified = true;
				if (OnlyFirstLFENCE) {
				break;
				}
				}
				// The following section will be LFENCEing before groups of terminators
				// that include branches. This will close the branch prediction side
				// channels since we will prevent code executing after misspeculation as
				// a result of the LFENCEs placed with this logic.

				// Keep track of the first terminator in a basic block since if we need
				// to LFENCE the terminators in this basic block we must add the
				// instruction before the first terminator in the basic block (as
				// opposed to before the terminator that indicates an LFENCE is
				// required). An example of why this is necessary is that the
				// X86InstrInfo::analyzeBranch method assumes all terminators are grouped
				// together and terminates its analysis once the first non-terminator
				// instruction is found.
				[Inline comment by jdoerfert (Done): Style: No braces around single statements. (also elsewhere)]
				if (MI.isTerminator() && FirstTerminator == nullptr) {
				FirstTerminator = &MI;
				}

				// Look for branch instructions that will require an LFENCE to be put
				// before this basic block's terminators.
				if (!MI.isBranch() || OmitBranchLFENCEs) {
				// This isn't a branch or we're not putting LFENCEs before branches.
				continue;
				}

				if (OnlyLFENCENonConst && hasConstantAddressingMode(MI)) {
				// This is a branch, but it only has constant addressing mode and we're
				// not adding LFENCEs before such branches.
				continue;
				}
				// This branch requires adding an LFENCE.
				BuildMI(MBB, FirstTerminator, DebugLoc(), TII->get(X86::LFENCE));
				NumLFENCEsInserted++;
				Modified = true;
				break;
				}
				}

				return Modified;
				}

				// This function returns whether the passed instruction uses a memory addressing
				// mode that is constant. We treat all memory addressing modes that read
				// from a register that is not %rip as non-constant. Note that the use
				// of the EFLAGS register results in an addressing mode being considered
				// non-constant, therefore all JCC instructions will return false from this
				// function since one of their operands will always be the EFLAGS register.
				static bool hasConstantAddressingMode(const MachineInstr &MI) {
				for (const MachineOperand &MO : MI.uses()) {
				[Inline comment by craig.topper (Not Done): Can this be ordered earlier to avoid the forward declaration?]
				if (MO.isReg() && X86::RIP != MO.getReg()) {
				return false;
				}
				}
				return true;
				}

				FunctionPass *llvm::createX86SpeculativeExecutionSideEffectSuppressionPass() {
				return new X86SpeculativeExecutionSideEffectSuppressionPass();
				}

				INITIALIZE_PASS(X86SpeculativeExecutionSideEffectSuppressionPass, "x86-seses",
				"X86 Speculative Execution Side Effect Suppression Pass", false,
				false)
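
The operand scan in hasConstantAddressingMode can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `ToyReg`, `ToyOperand`, and `toyHasConstantAddressingMode` are hypothetical names that stand in for LLVM's register and `MachineOperand` types, and the operand lists in the usage note are made up for illustration.

```cpp
#include <vector>

// Toy register set; not LLVM's X86 register enum.
enum class ToyReg { None, RIP, RAX, RDI, EFLAGS };

// Toy operand: either a register use or something else (label, immediate).
struct ToyOperand {
  bool isReg;
  ToyReg reg;
};

// Mirrors the rule in the patch: any register use other than RIP makes the
// addressing mode non-constant. EFLAGS counts as such a register, so a
// conditional jump modeled with an EFLAGS use is never "constant".
bool toyHasConstantAddressingMode(const std::vector<ToyOperand> &uses) {
  for (const ToyOperand &op : uses)
    if (op.isReg && op.reg != ToyReg::RIP)
      return false;
  return true;
}
```

In this model a direct jump whose only operand is a label is constant, a RIP-relative use is constant, and a conditional jump that reads EFLAGS is not, which is why the patch's comment notes that JCC instructions always return false.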

llvm/lib/Target/X86/X86TargetMachine.cpp

(74 unchanged lines not shown)
 extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeX86Target() {
   initializeFPSPass(PR);
   initializeX86CallFrameOptimizationPass(PR);
   initializeX86CmovConverterPassPass(PR);
   initializeX86ExpandPseudoPass(PR);
   initializeX86ExecutionDomainFixPass(PR);
   initializeX86DomainReassignmentPass(PR);
   initializeX86AvoidSFBPassPass(PR);
   initializeX86SpeculativeLoadHardeningPassPass(PR);
+  initializeX86SpeculativeExecutionSideEffectSuppressionPassPass(PR);
   initializeX86FlagsCopyLoweringPassPass(PR);
   initializeX86CondBrFoldingPassPass(PR);
   initializeX86OptimizeLEAPassPass(PR);
 }

 static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {
   if (TT.isOSBinFormatMachO()) {
     if (TT.getArch() == Triple::x86_64)
(430 unchanged lines not shown)
 void X86PassConfig::addPreEmitPass() {
   addPass(createX86InsertPrefetchPass());
   addPass(createX86InsertX87waitPass());
 }

 void X86PassConfig::addPreEmitPass2() {
   const Triple &TT = TM->getTargetTriple();
   const MCAsmInfo *MAI = TM->getMCAsmInfo();

+  // The X86 Speculative Execution Pass must run after all control
+  // flow graph modifying passes. As a result it was listed to run right before
+  // the X86 Retpoline Thunks pass. The reason it must run after control flow
+  // graph modifications is that the model of LFENCE in LLVM has to be updated
+  // (FIXME: Update model of LFENCE). Currently the placement of this pass was
+  // hand checked to ensure that the subsequent passes don't move the code
+  // around the LFENCEs in a way that will hurt the correctness of this pass.
+  // This placement has been shown to work based on hand inspection of the
+  // codegen output.
+  addPass(createX86SpeculativeExecutionSideEffectSuppressionPass());
   addPass(createX86RetpolineThunksPass());

   // Insert extra int3 instructions after trailing call instructions to avoid
   // issues in the unwinder.
   if (TT.isOSWindows() && TT.getArch() == Triple::x86_64)
     addPass(createX86AvoidTrailingCallPass());

   // Verify basic block incoming and outgoing cfa offset and register values and
(14 unchanged lines not shown)

llvm/test/CodeGen/X86/O0-pipeline.ll

(66 unchanged lines not shown)
 ; CHECK-NEXT: X86 Indirect Branch Tracking
 ; CHECK-NEXT: X86 vzeroupper inserter
 ; CHECK-NEXT: X86 Discriminate Memory Operands
 ; CHECK-NEXT: X86 Insert Cache Prefetches
 ; CHECK-NEXT: X86 insert wait instruction
 ; CHECK-NEXT: Contiguously Lay Out Funclets
 ; CHECK-NEXT: StackMap Liveness Analysis
 ; CHECK-NEXT: Live DEBUG_VALUE analysis
+; CHECK-NEXT: X86 Speculative Execution Side Effect Suppression Pass
 ; CHECK-NEXT: X86 Retpoline Thunks
 ; CHECK-NEXT: Check CFA info and insert CFI instructions if needed
 ; CHECK-NEXT: Lazy Machine Block Frequency Analysis
 ; CHECK-NEXT: Machine Optimization Remark Emitter
 ; CHECK-NEXT: X86 Assembly Printer
 ; CHECK-NEXT: Free MachineFunction

 define void @f() {
   ret void
 }

llvm/test/CodeGen/X86/O3-pipeline.ll

(175 unchanged lines not shown)
 ; CHECK-NEXT: X86 LEA Fixup
 ; CHECK-NEXT: Compressing EVEX instrs to VEX encoding when possible
 ; CHECK-NEXT: X86 Discriminate Memory Operands
 ; CHECK-NEXT: X86 Insert Cache Prefetches
 ; CHECK-NEXT: X86 insert wait instruction
 ; CHECK-NEXT: Contiguously Lay Out Funclets
 ; CHECK-NEXT: StackMap Liveness Analysis
 ; CHECK-NEXT: Live DEBUG_VALUE analysis
+; CHECK-NEXT: X86 Speculative Execution Side Effect Suppression Pass
[Inline comment by craig.topper (Done): Drop " Pass". Doesn't look like most other passes refer to themselves as Pass]
 ; CHECK-NEXT: X86 Retpoline Thunks
 ; CHECK-NEXT: Check CFA info and insert CFI instructions if needed
 ; CHECK-NEXT: Lazy Machine Block Frequency Analysis
 ; CHECK-NEXT: Machine Optimization Remark Emitter
 ; CHECK-NEXT: X86 Assembly Printer
 ; CHECK-NEXT: Free MachineFunction

 define void @f() {
   ret void
 }

llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression-omit-branch-lfences.ll

This file was added.

				; RUN: llc -mtriple=x86_64-unknown-linux-gnu -x86-seses-enable -x86-seses-omit-branch-lfences %s -o - | FileCheck %s

				define dso_local void @_Z4buzzv() {
				entry:
				%a = alloca i32, align 4
				store i32 10, i32* %a, align 4
				ret void
				}

				define dso_local i32 @_Z3barPi(i32* %p) {
				entry:
				%retval = alloca i32, align 4
				%p.addr = alloca i32*, align 8
				%a = alloca [4 x i32], align 16
				%len = alloca i32, align 4
				store i32* %p, i32** %p.addr, align 8
				%0 = bitcast [4 x i32]* %a to i8*
				store i32 4, i32* %len, align 4
				%1 = load i32*, i32** %p.addr, align 8
				%2 = load i32, i32* %1, align 4
				%3 = load i32, i32* %len, align 4
				%cmp = icmp slt i32 %2, %3
				br i1 %cmp, label %if.then, label %if.else

				if.then: ; preds = %entry
				%4 = load i32*, i32** %p.addr, align 8
				%5 = load i32, i32* %4, align 4
				%idxprom = sext i32 %5 to i64
				%arrayidx = getelementptr inbounds [4 x i32], [4 x i32]* %a, i64 0, i64 %idxprom
				%6 = load i32, i32* %arrayidx, align 4
				store i32 %6, i32* %retval, align 4
				br label %return

				if.else: ; preds = %entry
				store i32 -1, i32* %retval, align 4
				br label %return

				return: ; preds = %if.else, %if.then
				%7 = load i32, i32* %retval, align 4
				ret i32 %7
				}

				define dso_local i32 (i32*)* @_Z3bazv() {
				entry:
				%p = alloca i32 (i32*)*, align 8
				store i32 (i32*)* @_Z3barPi, i32 (i32*)** %p, align 8
				call void asm sideeffect "", "=*m,*m,~{dirflag},~{fpsr},~{flags}"(i32 (i32*)** %p, i32 (i32*)** %p) #3, !srcloc !2
				%0 = load i32 (i32*)*, i32 (i32*)** %p, align 8
				ret i32 (i32*)* %0
				}

				define dso_local void @_Z3fooPi(i32* %p) {
				entry:
				%p.addr = alloca i32*, align 8
				%t = alloca i32 (i32*)*, align 8
				store i32* %p, i32** %p.addr, align 8
				%call = call i32 (i32*)* @_Z3bazv()
				store i32 (i32*)* %call, i32 (i32*)** %t, align 8
				%0 = load i32 (i32*)*, i32 (i32*)** %t, align 8
				%1 = load i32*, i32** %p.addr, align 8
				%call1 = call i32 %0(i32* %1)
				ret void
				}

				!2 = !{i32 233}

				; CHECK: .globl _Z4buzzv # -- Begin function _Z4buzzv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z4buzzv,@function
				; CHECK:_Z4buzzv: # @_Z4buzzv
				; CHECK:.L_Z4buzzv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movl $10, -4(%rsp)
				; CHECK: retq
				; CHECK:.Lfunc_end0:
				; CHECK: .size _Z4buzzv, .Lfunc_end0-_Z4buzzv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3barPi # -- Begin function _Z3barPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3barPi,@function
				; CHECK:_Z3barPi: # @_Z3barPi
				; CHECK:.L_Z3barPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq %rdi, -40(%rsp)
				; CHECK: lfence
				; CHECK: movl $4, -28(%rsp)
				; CHECK: lfence
				; CHECK: cmpl $3, (%rdi)
				; CHECK: jg .LBB1_2
				; CHECK:# %bb.1: # %if.then
				; CHECK: lfence
				; CHECK: movq -40(%rsp), %rax
				; CHECK: lfence
				; CHECK: movslq (%rax), %rax
				; CHECK: lfence
				; CHECK: movl -24(%rsp,%rax,4), %eax
				; CHECK: lfence
				; CHECK: movl %eax, -44(%rsp)
				; CHECK: lfence
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.LBB1_2: # %if.else
				; CHECK: lfence
				; CHECK: movl $-1, -44(%rsp)
				; CHECK: lfence
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.Lfunc_end1:
				; CHECK: .size _Z3barPi, .Lfunc_end1-_Z3barPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3bazv # -- Begin function _Z3bazv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3bazv,@function
				; CHECK:_Z3bazv: # @_Z3bazv
				; CHECK:.L_Z3bazv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq $.L_Z3barPi$local, -8(%rsp)
				; CHECK: lfence
				; CHECK: #APP
				; CHECK: #NO_APP
				; CHECK: lfence
				; CHECK: movq -8(%rsp), %rax
				; CHECK: retq
				; CHECK:.Lfunc_end2:
				; CHECK: .size _Z3bazv, .Lfunc_end2-_Z3bazv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3fooPi # -- Begin function _Z3fooPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3fooPi,@function
				; CHECK:_Z3fooPi: # @_Z3fooPi
				; CHECK:.L_Z3fooPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: subq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 32
				; CHECK: lfence
				; CHECK: movq %rdi, 8(%rsp)
				; CHECK: callq .L_Z3bazv$local
				; CHECK: lfence
				; CHECK: movq %rax, 16(%rsp)
				; CHECK: lfence
				; CHECK: movq 8(%rsp), %rdi
				; CHECK: callq *%rax
				; CHECK: addq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 8
				; CHECK: retq
				; CHECK:.Lfunc_end3:
				; CHECK: .size _Z3fooPi, .Lfunc_end3-_Z3fooPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .section ".note.GNU-stack","",@progbits

llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression-only-first-lfence.ll

This file was added.

				; RUN: llc -mtriple=x86_64-unknown-linux-gnu -x86-seses-enable -x86-seses-only-first-lfence %s -o - | FileCheck %s

				define dso_local void @_Z4buzzv() {
				entry:
				%a = alloca i32, align 4
				store i32 10, i32* %a, align 4
				ret void
				}

				define dso_local i32 @_Z3barPi(i32* %p) {
				entry:
				%retval = alloca i32, align 4
				%p.addr = alloca i32*, align 8
				%a = alloca [4 x i32], align 16
				%len = alloca i32, align 4
				store i32* %p, i32** %p.addr, align 8
				%0 = bitcast [4 x i32]* %a to i8*
				store i32 4, i32* %len, align 4
				%1 = load i32*, i32** %p.addr, align 8
				%2 = load i32, i32* %1, align 4
				%3 = load i32, i32* %len, align 4
				%cmp = icmp slt i32 %2, %3
				br i1 %cmp, label %if.then, label %if.else

				if.then: ; preds = %entry
				%4 = load i32*, i32** %p.addr, align 8
				%5 = load i32, i32* %4, align 4
				%idxprom = sext i32 %5 to i64
				%arrayidx = getelementptr inbounds [4 x i32], [4 x i32]* %a, i64 0, i64 %idxprom
				%6 = load i32, i32* %arrayidx, align 4
				store i32 %6, i32* %retval, align 4
				br label %return

				if.else: ; preds = %entry
				store i32 -1, i32* %retval, align 4
				br label %return

				return: ; preds = %if.else, %if.then
				%7 = load i32, i32* %retval, align 4
				ret i32 %7
				}

				define dso_local i32 (i32*)* @_Z3bazv() {
				entry:
				%p = alloca i32 (i32*)*, align 8
				store i32 (i32*)* @_Z3barPi, i32 (i32*)** %p, align 8
				call void asm sideeffect "", "=*m,*m,~{dirflag},~{fpsr},~{flags}"(i32 (i32*)** %p, i32 (i32*)** %p) #3, !srcloc !2
				%0 = load i32 (i32*)*, i32 (i32*)** %p, align 8
				ret i32 (i32*)* %0
				}

				define dso_local void @_Z3fooPi(i32* %p) {
				entry:
				%p.addr = alloca i32*, align 8
				%t = alloca i32 (i32*)*, align 8
				store i32* %p, i32** %p.addr, align 8
				%call = call i32 (i32*)* @_Z3bazv()
				store i32 (i32*)* %call, i32 (i32*)** %t, align 8
				%0 = load i32 (i32*)*, i32 (i32*)** %t, align 8
				%1 = load i32*, i32** %p.addr, align 8
				%call1 = call i32 %0(i32* %1)
				ret void
				}

				!2 = !{i32 233}


				; CHECK: .globl _Z4buzzv # -- Begin function _Z4buzzv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z4buzzv,@function
				; CHECK:_Z4buzzv: # @_Z4buzzv
				; CHECK:.L_Z4buzzv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movl $10, -4(%rsp)
				; CHECK: retq
				; CHECK:.Lfunc_end0:
				; CHECK: .size _Z4buzzv, .Lfunc_end0-_Z4buzzv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3barPi # -- Begin function _Z3barPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3barPi,@function
				; CHECK:_Z3barPi: # @_Z3barPi
				; CHECK:.L_Z3barPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq %rdi, -40(%rsp)
				; CHECK: movl $4, -28(%rsp)
				; CHECK: cmpl $3, (%rdi)
				; CHECK: jg .LBB1_2
				; CHECK:# %bb.1: # %if.then
				; CHECK: lfence
				; CHECK: movq -40(%rsp), %rax
				; CHECK: movslq (%rax), %rax
				; CHECK: movl -24(%rsp,%rax,4), %eax
				; CHECK: movl %eax, -44(%rsp)
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.LBB1_2: # %if.else
				; CHECK: lfence
				; CHECK: movl $-1, -44(%rsp)
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.Lfunc_end1:
				; CHECK: .size _Z3barPi, .Lfunc_end1-_Z3barPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3bazv # -- Begin function _Z3bazv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3bazv,@function
				; CHECK:_Z3bazv: # @_Z3bazv
				; CHECK:.L_Z3bazv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq $.L_Z3barPi$local, -8(%rsp)
				; CHECK: #APP
				; CHECK: #NO_APP
				; CHECK: movq -8(%rsp), %rax
				; CHECK: retq
				; CHECK:.Lfunc_end2:
				; CHECK: .size _Z3bazv, .Lfunc_end2-_Z3bazv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3fooPi # -- Begin function _Z3fooPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3fooPi,@function
				; CHECK:_Z3fooPi: # @_Z3fooPi
				; CHECK:.L_Z3fooPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: subq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 32
				; CHECK: lfence
				; CHECK: movq %rdi, 8(%rsp)
				; CHECK: callq .L_Z3bazv$local
				; CHECK: movq %rax, 16(%rsp)
				; CHECK: movq 8(%rsp), %rdi
				; CHECK: callq *%rax
				; CHECK: addq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 8
				; CHECK: retq
				; CHECK:.Lfunc_end3:
				; CHECK: .size _Z3fooPi, .Lfunc_end3-_Z3fooPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .section ".note.GNU-stack","",@progbits

llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression-only-lfence-non-const.ll

This file was added.

				; RUN: llc -mtriple=x86_64-unknown-linux-gnu -x86-seses-enable -x86-seses-only-lfence-non-const %s -o - | FileCheck %s

				define dso_local void @_Z4buzzv() {
				entry:
				%a = alloca i32, align 4
				store i32 10, i32* %a, align 4
				ret void
				}

				define dso_local i32 @_Z3barPi(i32* %p) {
				entry:
				%retval = alloca i32, align 4
				%p.addr = alloca i32*, align 8
				%a = alloca [4 x i32], align 16
				%len = alloca i32, align 4
				store i32* %p, i32** %p.addr, align 8
				%0 = bitcast [4 x i32]* %a to i8*
				store i32 4, i32* %len, align 4
				%1 = load i32*, i32** %p.addr, align 8
				%2 = load i32, i32* %1, align 4
				%3 = load i32, i32* %len, align 4
				%cmp = icmp slt i32 %2, %3
				br i1 %cmp, label %if.then, label %if.else

				if.then: ; preds = %entry
				%4 = load i32*, i32** %p.addr, align 8
				%5 = load i32, i32* %4, align 4
				%idxprom = sext i32 %5 to i64
				%arrayidx = getelementptr inbounds [4 x i32], [4 x i32]* %a, i64 0, i64 %idxprom
				%6 = load i32, i32* %arrayidx, align 4
				store i32 %6, i32* %retval, align 4
				br label %return

				if.else: ; preds = %entry
				store i32 -1, i32* %retval, align 4
				br label %return

				return: ; preds = %if.else, %if.then
				%7 = load i32, i32* %retval, align 4
				ret i32 %7
				}

				define dso_local i32 (i32*)* @_Z3bazv() {
				entry:
				%p = alloca i32 (i32*)*, align 8
				store i32 (i32*)* @_Z3barPi, i32 (i32*)** %p, align 8
				call void asm sideeffect "", "=*m,*m,~{dirflag},~{fpsr},~{flags}"(i32 (i32*)** %p, i32 (i32*)** %p) #3, !srcloc !2
				%0 = load i32 (i32*)*, i32 (i32*)** %p, align 8
				ret i32 (i32*)* %0
				}

				define dso_local void @_Z3fooPi(i32* %p) {
				entry:
				%p.addr = alloca i32*, align 8
				%t = alloca i32 (i32*)*, align 8
				store i32* %p, i32** %p.addr, align 8
				%call = call i32 (i32*)* @_Z3bazv()
				store i32 (i32*)* %call, i32 (i32*)** %t, align 8
				%0 = load i32 (i32*)*, i32 (i32*)** %t, align 8
				%1 = load i32*, i32** %p.addr, align 8
				%call1 = call i32 %0(i32* %1)
				ret void
				}

				!2 = !{i32 233}

				; CHECK: .globl _Z4buzzv # -- Begin function _Z4buzzv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z4buzzv,@function
				; CHECK:_Z4buzzv: # @_Z4buzzv
				; CHECK:.L_Z4buzzv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movl $10, -4(%rsp)
				; CHECK: retq
				; CHECK:.Lfunc_end0:
				; CHECK: .size _Z4buzzv, .Lfunc_end0-_Z4buzzv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3barPi # -- Begin function _Z3barPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3barPi,@function
				; CHECK:_Z3barPi: # @_Z3barPi
				; CHECK:.L_Z3barPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq %rdi, -40(%rsp)
				; CHECK: lfence
				; CHECK: movl $4, -28(%rsp)
				; CHECK: lfence
				; CHECK: cmpl $3, (%rdi)
				; CHECK: lfence
				; CHECK: jg .LBB1_2
				; CHECK:# %bb.1: # %if.then
				; CHECK: lfence
				; CHECK: movq -40(%rsp), %rax
				; CHECK: lfence
				; CHECK: movslq (%rax), %rax
				; CHECK: lfence
				; CHECK: movl -24(%rsp,%rax,4), %eax
				; CHECK: lfence
				; CHECK: movl %eax, -44(%rsp)
				; CHECK: lfence
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.LBB1_2: # %if.else
				; CHECK: lfence
				; CHECK: movl $-1, -44(%rsp)
				; CHECK: lfence
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.Lfunc_end1:
				; CHECK: .size _Z3barPi, .Lfunc_end1-_Z3barPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3bazv # -- Begin function _Z3bazv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3bazv,@function
				; CHECK:_Z3bazv: # @_Z3bazv
				; CHECK:.L_Z3bazv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq $.L_Z3barPi$local, -8(%rsp)
				; CHECK: lfence
				; CHECK: #APP
				; CHECK: #NO_APP
				; CHECK: lfence
				; CHECK: movq -8(%rsp), %rax
				; CHECK: retq
				; CHECK:.Lfunc_end2:
				; CHECK: .size _Z3bazv, .Lfunc_end2-_Z3bazv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3fooPi # -- Begin function _Z3fooPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3fooPi,@function
				; CHECK:_Z3fooPi: # @_Z3fooPi
				; CHECK:.L_Z3fooPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: subq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 32
				; CHECK: lfence
				; CHECK: movq %rdi, 8(%rsp)
				; CHECK: callq .L_Z3bazv$local
				; CHECK: lfence
				; CHECK: movq %rax, 16(%rsp)
				; CHECK: lfence
				; CHECK: movq 8(%rsp), %rdi
				; CHECK: callq *%rax
				; CHECK: addq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 8
				; CHECK: retq
				; CHECK:.Lfunc_end3:
				; CHECK: .size _Z3fooPi, .Lfunc_end3-_Z3fooPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .section ".note.GNU-stack","",@progbits

llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression.ll

This file was added.

				; RUN: llc -mtriple=x86_64-unknown-linux-gnu -x86-seses-enable %s -o - | FileCheck %s

				define dso_local void @_Z4buzzv() {
				entry:
				%a = alloca i32, align 4
				store i32 10, i32* %a, align 4
				ret void
				}

				define dso_local i32 @_Z3barPi(i32* %p) {
				entry:
				%retval = alloca i32, align 4
				%p.addr = alloca i32*, align 8
				%a = alloca [4 x i32], align 16
				%len = alloca i32, align 4
				store i32* %p, i32** %p.addr, align 8
				%0 = bitcast [4 x i32]* %a to i8*
				store i32 4, i32* %len, align 4
				%1 = load i32*, i32** %p.addr, align 8
				%2 = load i32, i32* %1, align 4
				%3 = load i32, i32* %len, align 4
				%cmp = icmp slt i32 %2, %3
				br i1 %cmp, label %if.then, label %if.else

				if.then: ; preds = %entry
				%4 = load i32*, i32** %p.addr, align 8
				%5 = load i32, i32* %4, align 4
				%idxprom = sext i32 %5 to i64
				%arrayidx = getelementptr inbounds [4 x i32], [4 x i32]* %a, i64 0, i64 %idxprom
				%6 = load i32, i32* %arrayidx, align 4
				store i32 %6, i32* %retval, align 4
				br label %return

				if.else: ; preds = %entry
				store i32 -1, i32* %retval, align 4
				br label %return

				return: ; preds = %if.else, %if.then
				%7 = load i32, i32* %retval, align 4
				ret i32 %7
				}

				define dso_local i32 (i32*)* @_Z3bazv() {
				entry:
				%p = alloca i32 (i32*)*, align 8
				store i32 (i32*)* @_Z3barPi, i32 (i32*)** %p, align 8
				call void asm sideeffect "", "=*m,*m,~{dirflag},~{fpsr},~{flags}"(i32 (i32*)** %p, i32 (i32*)** %p) #3, !srcloc !2
				%0 = load i32 (i32*)*, i32 (i32*)** %p, align 8
				ret i32 (i32*)* %0
				}

				define dso_local void @_Z3fooPi(i32* %p) {
				entry:
				%p.addr = alloca i32*, align 8
				%t = alloca i32 (i32*)*, align 8
				store i32* %p, i32** %p.addr, align 8
				%call = call i32 (i32*)* @_Z3bazv()
				store i32 (i32*)* %call, i32 (i32*)** %t, align 8
				%0 = load i32 (i32*)*, i32 (i32*)** %t, align 8
				%1 = load i32*, i32** %p.addr, align 8
				%call1 = call i32 %0(i32* %1)
				ret void
				}

				!2 = !{i32 233}

				; CHECK: .globl _Z4buzzv # -- Begin function _Z4buzzv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z4buzzv,@function
				; CHECK:_Z4buzzv: # @_Z4buzzv
				; CHECK:.L_Z4buzzv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movl $10, -4(%rsp)
				; CHECK: retq
				; CHECK:.Lfunc_end0:
				; CHECK: .size _Z4buzzv, .Lfunc_end0-_Z4buzzv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3barPi # -- Begin function _Z3barPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3barPi,@function
				; CHECK:_Z3barPi: # @_Z3barPi
				; CHECK:.L_Z3barPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq %rdi, -40(%rsp)
				; CHECK: lfence
				; CHECK: movl $4, -28(%rsp)
				; CHECK: lfence
				; CHECK: cmpl $3, (%rdi)
				; CHECK: lfence
				; CHECK: jg .LBB1_2
				; CHECK:# %bb.1: # %if.then
				; CHECK: lfence
				; CHECK: movq -40(%rsp), %rax
				; CHECK: lfence
				; CHECK: movslq (%rax), %rax
				; CHECK: lfence
				; CHECK: movl -24(%rsp,%rax,4), %eax
				; CHECK: lfence
				; CHECK: movl %eax, -44(%rsp)
				; CHECK: lfence
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.LBB1_2: # %if.else
				; CHECK: lfence
				; CHECK: movl $-1, -44(%rsp)
				; CHECK: lfence
				; CHECK: movl -44(%rsp), %eax
				; CHECK: retq
				; CHECK:.Lfunc_end1:
				; CHECK: .size _Z3barPi, .Lfunc_end1-_Z3barPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3bazv # -- Begin function _Z3bazv
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3bazv,@function
				; CHECK:_Z3bazv: # @_Z3bazv
				; CHECK:.L_Z3bazv$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: lfence
				; CHECK: movq $.L_Z3barPi$local, -8(%rsp)
				; CHECK: lfence
				; CHECK: #APP
				; CHECK: #NO_APP
				; CHECK: lfence
				; CHECK: movq -8(%rsp), %rax
				; CHECK: retq
				; CHECK:.Lfunc_end2:
				; CHECK: .size _Z3bazv, .Lfunc_end2-_Z3bazv
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .globl _Z3fooPi # -- Begin function _Z3fooPi
				; CHECK: .p2align 4, 0x90
				; CHECK: .type _Z3fooPi,@function
				; CHECK:_Z3fooPi: # @_Z3fooPi
				; CHECK:.L_Z3fooPi$local:
				; CHECK: .cfi_startproc
				; CHECK:# %bb.0: # %entry
				; CHECK: subq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 32
				; CHECK: lfence
				; CHECK: movq %rdi, 8(%rsp)
				; CHECK: callq .L_Z3bazv$local
				; CHECK: lfence
				; CHECK: movq %rax, 16(%rsp)
				; CHECK: lfence
				; CHECK: movq 8(%rsp), %rdi
				; CHECK: callq *%rax
				; CHECK: addq $24, %rsp
				; CHECK: .cfi_def_cfa_offset 8
				; CHECK: retq
				; CHECK:.Lfunc_end3:
				; CHECK: .size _Z3fooPi, .Lfunc_end3-_Z3fooPi
				; CHECK: .cfi_endproc
				; CHECK: # -- End function
				; CHECK: .section ".note.GNU-stack","",@progbits
MaskRay: .note.GNU-stack is not needed.

Diff 249431
