This is an archive of the discontinued LLVM Phabricator instance.


Differential D32277

Replace slow LEA instructions in X86
Closed · Public

Authored by lsaba on Apr 20 2017, 1:45 AM.

Details

Reviewers

RKSimon
zvi
zansari
aaboud

Commits

rG2ea271b54a70: [X86] Replace slow LEA instructions in X86
rG52e892577d59: [X86] Replace slow LEA instructions in X86
rL303333: [X86] Replace slow LEA instructions in X86
rL303183: [X86] Replace slow LEA instructions in X86

Summary

According to Intel's Optimization Reference Manual for SNB+:

"For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must dispatch via port 1:
- LEA that has all three source operands: base, index, and offset
- LEA that uses base and index registers where the base is EBP, RBP, or R13
- LEA that uses RIP relative addressing mode
- LEA that uses 16-bit addressing mode"

This patch currently handles only the first two cases.
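As an illustration of the two cases the summary says are handled, here is a minimal, self-contained sketch of how such LEA forms could be classified. It is not the patch's code: the `Reg` enum, the `LeaAddress` struct, and the function names are hypothetical stand-ins for LLVM's `MachineOperand`-based helpers, and the model ignores segments and symbolic offsets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified register set; not LLVM's X86:: register enum.
enum class Reg { None, RAX, RBX, RCX, EBP, RBP, R13, R14 };

// Toy model of an LEA address of the form disp(base, index, scale).
struct LeaAddress {
  Reg base;
  Reg index;
  int scale;
  std::int64_t disp;
};

// Second case of the quoted text: base and index are both present
// and the base is EBP, RBP, or R13.
inline bool hasInefficientBase(const LeaAddress &a) {
  if (a.base == Reg::None || a.index == Reg::None)
    return false;
  return a.base == Reg::EBP || a.base == Reg::RBP || a.base == Reg::R13;
}

// First case of the quoted text: base, index, and a non-zero offset.
inline bool isThreeOperand(const LeaAddress &a) {
  return a.base != Reg::None && a.index != Reg::None && a.disp != 0;
}

// True if the address falls into either of the two handled cases.
inline bool isSlowLea(const LeaAddress &a) {
  return isThreeOperand(a) || hasInefficientBase(a);
}
```

Under this model, `8(%r13)` with no index register is not classified as slow, while `(%r13,%rcx)` is, which matches the distinction drawn later in the discussion.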

Diff Detail

Repository: rL LLVM

Event Timeline

lsaba created this revision. Apr 20 2017, 1:45 AM

lsaba added a reviewer: RKSimon. Apr 20 2017, 1:56 AM

Could you please also add some negative tests for cases where the transformation cannot be applied?
For example, cases involving EFLAGS.

We report our observation regarding this issue.
We observed 10% speed up for the Queens benchmark in Nightly Test by simply swapping r13 and r14.

Here are more details.

LLVM & Clang: 4.0 (official release)
Queens.c: SingleSource/Benchmarks/Stanford/Queens.c in Nightly Tests.
Queens.s: clang -O3 -S Queens.c
Queens.swap.s: obtained by simply swapping r13 and r14 in Queens.s

We compiled "Queens.s" and "Queens.swap.s" using clang 4.0 (and gcc 5.4.0 too) and observed that Queens.swap.s is 10% faster than Queens.s on the following machine.

CPU: Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz
OS: Ubuntu 16.04
We tested the same on several other machines and observed a consistent speedup of 2% to 10%, depending on the machine.

Thanks.

Queens.c (3 KB)
Queens.s (14 KB)
Queens.swap.s (14 KB)
result.txt (9 KB)

In D32277#731870, @aqjune wrote:

We report our observation regarding this issue.
...

How does your comment relate to this patch?

zvi added inline comments. Apr 20 2017, 4:09 AM

lib/Target/X86/X86FixupLEAs.cpp
110 ↗	(On Diff #95903)	Method name should start with a lowercase character
200 ↗	(On Diff #95903)	Maybe pass a MachineBasicBlock &MBB instead?
201 ↗	(On Diff #95903)	Itr -> InsertBefore
203 ↗	(On Diff #95903)	What is the benefit of dumping every instruction added? If you do keep the print, please make it more descriptive. I know this was there before you moved the code, but still.
204 ↗	(On Diff #95903)	return NewMI
299 ↗	(On Diff #95903)	This function and others below can take const MachineInstr& arguments.
318 ↗	(On Diff #95903)	Does it make more sense to check if isLEA() before calling this function and document that MI is a LEA instruction?
367 ↗	(On Diff #95903)	Can you avoid the need for this function by using TII::copyPhysReg?
569 ↗	(On Diff #95903)	Is it safe to erase while iterating over instructions? Maybe erase the instructions after you visited all the interesting instructions?
578 ↗	(On Diff #95903)	You don't need to create an instruction and then insert it using your Insert function. There is already a MachineInstrBuilder::BuildMI overload that does that
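The comment on line 569 asks whether erasing during iteration is safe and suggests erasing after the walk instead. A minimal sketch of that two-phase pattern follows, using `std::list<std::string>` as a stand-in for a basic block of instructions. The container, the function name `eraseMarked`, and the string mnemonics are assumptions made for illustration; this is not the patch's code and does not use LLVM's `MachineBasicBlock` API.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Phase 1 walks the block and only records which entries to drop.
// Phase 2 erases them once the walk is finished, so the loop iterator
// is never invalidated while it is still in use.
inline std::size_t eraseMarked(std::list<std::string> &block,
                               const std::string &mnemonic) {
  std::vector<std::list<std::string>::iterator> toErase;
  for (auto it = block.begin(); it != block.end(); ++it) {
    if (*it == mnemonic)
      toErase.push_back(it);
  }
  // Erasing from a std::list invalidates only the erased element's
  // iterator, so the remaining recorded iterators stay valid.
  for (auto it : toErase)
    block.erase(it);
  return toErase.size();
}
```

The deferred form trades a small side vector for not having to reason about iterator validity inside the main loop.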

efriedma added a subscriber: efriedma. Apr 20 2017, 12:10 PM

efriedma added inline comments.

lib/Target/X86/X86FixupLEAs.cpp
296 ↗	(On Diff #95903)	Do you need to check for R13D?

@skatkov Well, the assemblies also included leal with r13 register as well. :)

102     leal    8(%r14), %eax
->
102     leal    8(%r13), %eax

The performance gap may be due to the instruction, but I'm not sure. (actually, converting r14 to r13 increased performance in this case, but I have no idea what's happening inside CPU..)

Interesting... it would be nice to know why :)

Anyway, I still believe that some negative tests should also be added here. For example, for the sequence
cmp
lea
jb

lea cannot be converted to add, because add implicitly clobbers and redefines EFLAGS while lea leaves it untouched.
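The cmp/lea/jb concern can be sketched as a forward scan over a toy instruction list: replacing an instruction with one that writes flags is only safe if no later instruction reads the flags before they are redefined. The `ToyInst` struct and `isSafeToClobberFlags` are hypothetical names for this sketch. It is a simplified model and not the logic of LLVM's `isSafeToClobberEFLAGS`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction that only records how it interacts with the flags.
struct ToyInst {
  const char *name;
  bool readsFlags;
  bool writesFlags;
};

// Returns true if the instruction at `pos` may be replaced by one that
// clobbers the flags, judged by scanning the rest of the block.
inline bool isSafeToClobberFlags(const std::vector<ToyInst> &block,
                                 std::size_t pos) {
  for (std::size_t i = pos + 1; i < block.size(); ++i) {
    if (block[i].readsFlags)
      return false; // a later reader would observe the clobbered flags
    if (block[i].writesFlags)
      return true; // flags are redefined before anything reads them
  }
  // Nothing decisive found: conservatively assume the flags are live
  // out of the block.
  return false;
}
```

In the cmp/lea/jb sequence the scan from the lea reaches jb, which reads the flags set by cmp, so the replacement is rejected.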

RKSimon added inline comments. Apr 21 2017, 2:37 AM

lib/Target/X86/X86FixupLEAs.cpp
633 ↗	(On Diff #95903)	Is it a good idea to directly associate a general (and very vague....) feature bit with a specific set of targets like this?

D32352 is looking at more aggressive conversion of IMUL to multiple LEA instructions.

In D32277#733146, @aqjune wrote:
@skatkov Well, the assemblies also included leal with r13 register as well. :)
102     leal    8(%r14), %eax
->
102     leal    8(%r13), %eax
The performance gap may be due to the instruction, but I'm not sure. (actually, converting r14 to r13 increased performance in this case, but I have no idea what's happening inside CPU..)

I double-checked these two instructions and ran a test of my own. The r13 version is not slower, so it looks like the problem is somewhere else in the code.

lib/Target/X86/X86FixupLEAs.cpp
633 ↗	(On Diff #95903)	I am not sure I understand the comment. Are you referring to the usage of the SlowLEA feature specifically, or to limiting the optimization to a set of targets using a feature in general? Do you have other suggestions?

In D32277#735719, @RKSimon wrote:

D32352 is looking at more aggressive conversion of IMUL to multiple LEA instructions.

Thanks for notifying. As far as I can see, that patch does not contain any 3-operand LEAs. The only case we need to be careful with is a 2-operand LEA with RBP/R13/EBP as the base register. Since this is only determined after RA, I think it is better to let my patch fix those cases rather than prevent that patch from running on the problematic targets.

In D32277#736509, @lsaba wrote:

Thanks for notifying. As far as I can see, that patch does not contain any 3-operand LEAs. The only case we need to be careful with is a 2-operand LEA with RBP/R13/EBP as the base register. Since this is only determined after RA, I think it is better to let my patch fix those cases rather than prevent that patch from running on the problematic targets.

Sorry for being pedantic about the naming, but for AMD architectures the 'slowlea' cases (whether it uses the ALU or AGU pipe) include scale != 1 (even for 2-ops), but it doesn't seem to be noticeably affected by RBP/R13/EBP. Hence my interest in PR32326 to try and make it easier to discriminate.

In D32277#736540, @RKSimon wrote:

Sorry for being pedantic about the naming, but for AMD architectures the 'slowlea' cases (whether it uses the ALU or AGU pipe) include scale != 1 (even for 2-ops), but it doesn't seem to be noticeably affected by RBP/R13/EBP. Hence my interest in PR32326 to try and make it easier to discriminate.

Will separating this feature from the existing slowLEA feature which is used in SLM (and giving it another name) make it less confusing?

In D32277#736594, @lsaba wrote:

Will separating this feature from the existing slowLEA feature which is used in SLM (and giving it another name) make it less confusing?

Yes please, we need to discriminate between different slow LEA behaviours and separate features is probably the most straightforward way to do it.

zvi mentioned this in D32352: Go to eleven. Apr 25 2017, 8:51 AM

Updated patch with Zvi's comments

In D32277#731728, @skatkov wrote:

Could you please also add some negative tests for cases where the transformation cannot be applied?
For example, cases involving EFLAGS.

I added negative tests

lib/Target/X86/X86FixupLEAs.cpp
110 ↗	(On Diff #95903)	I removed this function
200 ↗	(On Diff #95903)	I removed this function
201 ↗	(On Diff #95903)	I removed this function
203 ↗	(On Diff #95903)	Keeping it since it's helpful, the description is at the beginning of the function
296 ↗	(On Diff #95903)	there are no LEA instructions in 64bit which take 32-bit register operands, so we shouldn't run into this case

In D32277#736652, @RKSimon wrote:

Yes please, we need to discriminate between different slow LEA behaviours and separate features is probably the most straightforward way to do it.

separated into 2 features

a.elovikov added a subscriber: a.elovikov. Apr 26 2017, 9:32 AM

I did not catch the conclusion: is R13 a bad register or not?
From what I see in the comments it is not, but the patch still treats it as one?
Could you please clarify this?

In D32277#739023, @skatkov wrote:

I did not catch the conclusion: is R13 a bad register or not?
From what I see in the comments it is not, but the patch still treats it as one?
Could you please clarify this?

According to the optimization guide: "LEA that uses base and index registers where the base is EBP, RBP, or R13". So R13 is bad only when there is an index register as well. In aqjune's example the instruction does not have an index register (leal 8(%r13), %eax).

Thanks

zvi added inline comments. Apr 27 2017, 10:11 AM

lib/Target/X86/X86FixupLEAs.cpp
479 ↗	(On Diff #96731)	No need to pass the iterator by reference. In fact, you could instead pass MachineInstr &MI and then you won't need to define 'MI' below.
483 ↗	(On Diff #96731)	Please capitalize
490 ↗	(On Diff #96731)	This bunch of MachineOperands can all be const
514 ↗	(On Diff #96731)	You can avoid this early declaration and assignment to nullptr by defining variables local to the scope of their use. At this point, it is known the function will never return nullptr, so the assignment here is not to a default return value.
596 ↗	(On Diff #96731)	Now that you added the feature, maybe rename 'processInstructionForSNBPlus' to something like 'processforSlow3OpLEA'?
lib/Target/X86/X86Subtarget.h
253 ↗	(On Diff #96731)	Please specify in the comment all cases to which this feature applies.

lsaba updated this revision to Diff 97616. May 3 2017, 5:44 AM

lsaba marked 6 inline comments as done.

remove redundant variable

zvi added inline comments. May 8 2017, 12:14 AM

lib/Target/X86/X86FixupLEAs.cpp
285 ↗	(On Diff #98100)	These helpers can be improved by avoiding the operand index magic by passing the operands of interest as arguments. The "heavy lifting" of identifying these operands happens at the beginning of 'processInstrForSlow3OpLEA'. Suggested signatures: isRegOperand(const MachineOperand &Op); hasInefficientLEABaseReg(const MachineOperand &Base, const MachineOperand &Index); hasLEAOffset(const MachineOperand &Offset); isThreeOperandsLEA(const MachineOperand &Base, const MachineOperand &Index, const MachineOperand &Offset)
481 ↗	(On Diff #98100)	Maybe rename to LEAOpcode?
516 ↗	(On Diff #98100)	with one or two (for the 3-op LEA case) add instructions?

lsaba updated this revision to Diff 98134. May 8 2017, 1:23 AM

lsaba marked 2 inline comments as done.

lsaba marked an inline comment as done.

zvi added inline comments. May 8 2017, 2:28 AM

lib/Target/X86/X86FixupLEAs.cpp
312 ↗	(On Diff #98134)	Sorry for not pointing this out earlier, but here is another index-magic case we can avoid by passing Offset to this function. It could possibly be changed to getADDFromLEA(int LEAOpcode, const MachineOperand &Offset, bool IsImm)
497 ↗	(On Diff #98134)	getOperand(5) -> Segment

igorb added a subscriber: igorb. May 8 2017, 2:30 AM

Updated with Zvi's comments

LGTM with some minor comments.
Thanks, Lama!

lib/Target/X86/X86FixupLEAs.cpp
308 ↗	(On Diff #98154)	This variable can be dropped if the case blocks are terminated with returns, e.g. case X86::LEA16r: return X86::ADD16rr;
327 ↗	(On Diff #98154)	Same as above

This revision is now accepted and ready to land. May 8 2017, 5:35 AM

lsaba marked 2 inline comments as done. May 8 2017, 6:10 AM

lsaba added inline comments.

lib/Target/X86/X86FixupLEAs.cpp
312 ↗	(On Diff #98134)	Decided to split this into 2 functions as well
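The split into two opcode-selection helpers mentioned above can be sketched as follows: one maps an LEA opcode to a register-register ADD, the other picks a register-immediate ADD based on whether the offset fits in a signed 8-bit immediate. This is a standalone sketch in which the `LeaOp` and `AddOp` enums and the function names are hypothetical stand-ins for LLVM's `X86::` opcodes. The mapping mirrors the switch statements shown in the diff of X86FixupLEAs.cpp in this revision.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcode enums standing in for LLVM's X86:: opcodes.
enum class LeaOp { LEA16r, LEA32r, LEA64_32r, LEA64r };
enum class AddOp {
  ADD16rr, ADD32rr, ADD64rr,
  ADD16ri8, ADD16ri, ADD32ri8, ADD32ri, ADD64ri8, ADD64ri32
};

// True if v is representable as a signed 8-bit immediate.
inline bool fitsInt8(std::int64_t v) { return v >= -128 && v <= 127; }

// Register-register form: depends only on the LEA opcode.
inline AddOp addRRFor(LeaOp op) {
  switch (op) {
  case LeaOp::LEA16r:    return AddOp::ADD16rr;
  case LeaOp::LEA32r:    return AddOp::ADD32rr;
  case LeaOp::LEA64_32r:
  case LeaOp::LEA64r:    return AddOp::ADD64rr;
  }
  return AddOp::ADD64rr; // not reached for valid enum values
}

// Register-immediate form: prefers the short imm8 encoding when the
// offset fits, otherwise falls back to the wider immediate.
inline AddOp addRIFor(LeaOp op, std::int64_t offset) {
  const bool isInt8 = fitsInt8(offset);
  switch (op) {
  case LeaOp::LEA16r:    return isInt8 ? AddOp::ADD16ri8 : AddOp::ADD16ri;
  case LeaOp::LEA32r:
  case LeaOp::LEA64_32r: return isInt8 ? AddOp::ADD32ri8 : AddOp::ADD32ri;
  case LeaOp::LEA64r:    return isInt8 ? AddOp::ADD64ri8 : AddOp::ADD64ri32;
  }
  return AddOp::ADD64ri32; // not reached for valid enum values
}
```

Returning directly from each case, as suggested in the review, removes the need for a temporary opcode variable.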

Closed by commit rL303183: [X86] Replace slow LEA instructions in X86 (authored by lsaba). May 16 2017, 9:15 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

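Path	Size
llvm/trunk/lib/Target/X86/X86.td	3 lines
llvm/trunk/lib/Target/X86/X86FixupLEAs.cpp	269 lines
llvm/trunk/lib/Target/X86/X86Subtarget.h	6 lines
llvm/trunk/lib/Target/X86/X86TargetMachine.cpp	2 lines
llvm/trunk/test/CodeGen/X86/leaFixup32.mir	508 lines
llvm/trunk/test/CodeGen/X86/leaFixup64.mir	1041 lines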

Diff 99156

llvm/trunk/lib/Target/X86/X86.td

... 229 lines not shown ...
// instructions, which should be avoided in favor of a MOV + register CALL/PUSH.		// instructions, which should be avoided in favor of a MOV + register CALL/PUSH.
def FeatureCallRegIndirect : SubtargetFeature<"call-reg-indirect",		def FeatureCallRegIndirect : SubtargetFeature<"call-reg-indirect",
"CallRegIndirect", "true",		"CallRegIndirect", "true",
"Call register indirect">;		"Call register indirect">;
def FeatureLEAUsesAG : SubtargetFeature<"lea-uses-ag", "LEAUsesAG", "true",		def FeatureLEAUsesAG : SubtargetFeature<"lea-uses-ag", "LEAUsesAG", "true",
"LEA instruction needs inputs at AG stage">;		"LEA instruction needs inputs at AG stage">;
def FeatureSlowLEA : SubtargetFeature<"slow-lea", "SlowLEA", "true",		def FeatureSlowLEA : SubtargetFeature<"slow-lea", "SlowLEA", "true",
"LEA instruction with certain arguments is slow">;		"LEA instruction with certain arguments is slow">;
		def FeatureSlow3OpsLEA : SubtargetFeature<"slow-3ops-lea", "Slow3OpsLEA", "true",
		"LEA instruction with 3 ops or certain registers is slow">;
def FeatureSlowIncDec : SubtargetFeature<"slow-incdec", "SlowIncDec", "true",		def FeatureSlowIncDec : SubtargetFeature<"slow-incdec", "SlowIncDec", "true",
"INC and DEC instructions are slower than ADD and SUB">;		"INC and DEC instructions are slower than ADD and SUB">;
def FeatureSoftFloat		def FeatureSoftFloat
: SubtargetFeature<"soft-float", "UseSoftFloat", "true",		: SubtargetFeature<"soft-float", "UseSoftFloat", "true",
"Use software floating point features.">;		"Use software floating point features.">;
// On some X86 processors, there is no performance hazard to writing only the		// On some X86 processors, there is no performance hazard to writing only the
// lower parts of a YMM or ZMM register without clearing the upper part.		// lower parts of a YMM or ZMM register without clearing the upper part.
def FeatureFastPartialYMMorZMMWrite		def FeatureFastPartialYMMorZMMWrite
... 229 lines not shown ...	def SNBFeatures : ProcessorFeatures<[], [
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeatureSlowDivide64,		FeatureSlowDivide64,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureXSAVE,		FeatureXSAVE,
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureLAHFSAHF,		FeatureLAHFSAHF,
		FeatureSlow3OpsLEA,
FeatureFastScalarFSQRT,		FeatureFastScalarFSQRT,
FeatureFastSHLDRotate		FeatureFastSHLDRotate
]>;		]>;

class SandyBridgeProc<string Name> : ProcModel<Name, SandyBridgeModel,		class SandyBridgeProc<string Name> : ProcModel<Name, SandyBridgeModel,
SNBFeatures.Value, [		SNBFeatures.Value, [
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureSlowUAMem32		FeatureSlowUAMem32
... 420 lines not shown ...

llvm/trunk/lib/Target/X86/X86FixupLEAs.cpp

... 21 lines not shown ...
#include "llvm/CodeGen/MachineInstrBuilder.h"		#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/Passes.h"		#include "llvm/CodeGen/Passes.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetInstrInfo.h"		#include "llvm/Target/TargetInstrInfo.h"
using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "x86-fixup-LEAs"		namespace llvm {
		void initializeFixupLEAPassPass(PassRegistry &);
		}

		#define FIXUPLEA_DESC "X86 LEA Fixup"
		#define FIXUPLEA_NAME "x86-fixup-LEAs"

		#define DEBUG_TYPE FIXUPLEA_NAME

STATISTIC(NumLEAs, "Number of LEA instructions created");		STATISTIC(NumLEAs, "Number of LEA instructions created");

namespace {		namespace {
class FixupLEAPass : public MachineFunctionPass {		class FixupLEAPass : public MachineFunctionPass {
enum RegUsageState { RU_NotUsed, RU_Write, RU_Read };		enum RegUsageState { RU_NotUsed, RU_Write, RU_Read };
static char ID;
/// \brief Loop over all of the instructions in the basic block		/// \brief Loop over all of the instructions in the basic block
/// replacing applicable instructions with LEA instructions,		/// replacing applicable instructions with LEA instructions,
/// where appropriate.		/// where appropriate.
bool processBasicBlock(MachineFunction &MF, MachineFunction::iterator MFI);		bool processBasicBlock(MachineFunction &MF, MachineFunction::iterator MFI);

StringRef getPassName() const override { return "X86 LEA Fixup"; }

/// \brief Given a machine register, look for the instruction		/// \brief Given a machine register, look for the instruction
/// which writes it in the current basic block. If found,		/// which writes it in the current basic block. If found,
/// try to replace it with an equivalent LEA instruction.		/// try to replace it with an equivalent LEA instruction.
/// If replacement succeeds, then also process the newly created		/// If replacement succeeds, then also process the newly created
/// instruction.		/// instruction.
void seekLEAFixup(MachineOperand &p, MachineBasicBlock::iterator &I,		void seekLEAFixup(MachineOperand &p, MachineBasicBlock::iterator &I,
MachineFunction::iterator MFI);		MachineFunction::iterator MFI);

/// \brief Given a memory access or LEA instruction		/// \brief Given a memory access or LEA instruction
/// whose address mode uses a base and/or index register, look for		/// whose address mode uses a base and/or index register, look for
/// an opportunity to replace the instruction which sets the base or index		/// an opportunity to replace the instruction which sets the base or index
/// register with an equivalent LEA instruction.		/// register with an equivalent LEA instruction.
void processInstruction(MachineBasicBlock::iterator &I,		void processInstruction(MachineBasicBlock::iterator &I,
MachineFunction::iterator MFI);		MachineFunction::iterator MFI);

/// \brief Given a LEA instruction which is unprofitable		/// \brief Given a LEA instruction which is unprofitable
/// on Silvermont try to replace it with an equivalent ADD instruction		/// on Silvermont try to replace it with an equivalent ADD instruction
void processInstructionForSLM(MachineBasicBlock::iterator &I,		void processInstructionForSLM(MachineBasicBlock::iterator &I,
MachineFunction::iterator MFI);		MachineFunction::iterator MFI);


		/// \brief Given a LEA instruction which is unprofitable
		/// on SNB+ try to replace it with other instructions.
		/// According to Intel's Optimization Reference Manual:
		/// " For LEA instructions with three source operands and some specific
		/// situations, instruction latency has increased to 3 cycles, and must
		/// dispatch via port 1:
		/// - LEA that has all three source operands: base, index, and offset
		/// - LEA that uses base and index registers where the base is EBP, RBP,
		/// or R13
		/// - LEA that uses RIP relative addressing mode
		/// - LEA that uses 16-bit addressing mode "
		/// This function currently handles the first 2 cases only.
		MachineInstr *processInstrForSlow3OpLEA(MachineInstr &MI,
		MachineFunction::iterator MFI);

/// \brief Look for LEAs that add 1 to reg or subtract 1 from reg		/// \brief Look for LEAs that add 1 to reg or subtract 1 from reg
/// and convert them to INC or DEC respectively.		/// and convert them to INC or DEC respectively.
bool fixupIncDec(MachineBasicBlock::iterator &I,		bool fixupIncDec(MachineBasicBlock::iterator &I,
MachineFunction::iterator MFI) const;		MachineFunction::iterator MFI) const;

/// \brief Determine if an instruction references a machine register		/// \brief Determine if an instruction references a machine register
/// and, if so, whether it reads or writes the register.		/// and, if so, whether it reads or writes the register.
RegUsageState usesRegister(MachineOperand &p, MachineBasicBlock::iterator I);		RegUsageState usesRegister(MachineOperand &p, MachineBasicBlock::iterator I);

/// \brief Step backwards through a basic block, looking		/// \brief Step backwards through a basic block, looking
/// for an instruction which writes a register within		/// for an instruction which writes a register within
/// a maximum of INSTR_DISTANCE_THRESHOLD instruction latency cycles.		/// a maximum of INSTR_DISTANCE_THRESHOLD instruction latency cycles.
MachineBasicBlock::iterator searchBackwards(MachineOperand &p,		MachineBasicBlock::iterator searchBackwards(MachineOperand &p,
MachineBasicBlock::iterator &I,		MachineBasicBlock::iterator &I,
MachineFunction::iterator MFI);		MachineFunction::iterator MFI);

/// \brief if an instruction can be converted to an		/// \brief if an instruction can be converted to an
/// equivalent LEA, insert the new instruction into the basic block		/// equivalent LEA, insert the new instruction into the basic block
/// and return a pointer to it. Otherwise, return zero.		/// and return a pointer to it. Otherwise, return zero.
MachineInstr *postRAConvertToLEA(MachineFunction::iterator &MFI,		MachineInstr *postRAConvertToLEA(MachineFunction::iterator &MFI,
MachineBasicBlock::iterator &MBBI) const;		MachineBasicBlock::iterator &MBBI) const;

public:		public:
FixupLEAPass() : MachineFunctionPass(ID) {}		static char ID;

		StringRef getPassName() const override { return FIXUPLEA_DESC; }

		FixupLEAPass() : MachineFunctionPass(ID) {
		initializeFixupLEAPassPass(*PassRegistry::getPassRegistry());
		}

/// \brief Loop over all of the basic blocks,		/// \brief Loop over all of the basic blocks,
/// replacing instructions by equivalent LEA instructions		/// replacing instructions by equivalent LEA instructions
/// if needed and when possible.		/// if needed and when possible.
bool runOnMachineFunction(MachineFunction &MF) override;		bool runOnMachineFunction(MachineFunction &MF) override;

// This pass runs after regalloc and doesn't support VReg operands.		// This pass runs after regalloc and doesn't support VReg operands.
MachineFunctionProperties getRequiredProperties() const override {		MachineFunctionProperties getRequiredProperties() const override {
return MachineFunctionProperties().set(		return MachineFunctionProperties().set(
MachineFunctionProperties::Property::NoVRegs);		MachineFunctionProperties::Property::NoVRegs);
}		}

private:		private:
MachineFunction *MF;		MachineFunction *MF;
const X86InstrInfo *TII; // Machine instruction info.		const X86InstrInfo *TII; // Machine instruction info.
bool OptIncDec;		bool OptIncDec;
bool OptLEA;		bool OptLEA;
};		};
char FixupLEAPass::ID = 0;
}		}

		char FixupLEAPass::ID = 0;

		INITIALIZE_PASS(FixupLEAPass, FIXUPLEA_NAME, FIXUPLEA_DESC, false, false)

MachineInstr *		MachineInstr *
FixupLEAPass::postRAConvertToLEA(MachineFunction::iterator &MFI,		FixupLEAPass::postRAConvertToLEA(MachineFunction::iterator &MFI,
MachineBasicBlock::iterator &MBBI) const {		MachineBasicBlock::iterator &MBBI) const {
MachineInstr &MI = *MBBI;		MachineInstr &MI = *MBBI;
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
case X86::MOV32rr:		case X86::MOV32rr:
case X86::MOV64rr: {		case X86::MOV64rr: {
const MachineOperand &Src = MI.getOperand(1);		const MachineOperand &Src = MI.getOperand(1);
... 45 lines not shown ...

bool FixupLEAPass::runOnMachineFunction(MachineFunction &Func) {		bool FixupLEAPass::runOnMachineFunction(MachineFunction &Func) {
if (skipFunction(*Func.getFunction()))		if (skipFunction(*Func.getFunction()))
return false;		return false;

MF = &Func;		MF = &Func;
const X86Subtarget &ST = Func.getSubtarget<X86Subtarget>();		const X86Subtarget &ST = Func.getSubtarget<X86Subtarget>();
OptIncDec = !ST.slowIncDec() || Func.getFunction()->optForMinSize();		OptIncDec = !ST.slowIncDec() || Func.getFunction()->optForMinSize();
OptLEA = ST.LEAusesAG() || ST.slowLEA();		OptLEA = ST.LEAusesAG() || ST.slowLEA() || ST.slow3OpsLEA();

if (!OptLEA && !OptIncDec)		if (!OptLEA && !OptIncDec)
return false;		return false;

TII = ST.getInstrInfo();		TII = ST.getInstrInfo();

DEBUG(dbgs() << "Start X86FixupLEAs\n";);		DEBUG(dbgs() << "Start X86FixupLEAs\n";);
// Process all basic blocks.		// Process all basic blocks.
... 57 lines not shown ...	while (Found && I != CurInst) {
}		}
InstrDistance += TII->getInstrLatency(		InstrDistance += TII->getInstrLatency(
MF->getSubtarget().getInstrItineraryData(), *CurInst);		MF->getSubtarget().getInstrItineraryData(), *CurInst);
Found = getPreviousInstr(CurInst, MFI);		Found = getPreviousInstr(CurInst, MFI);
}		}
return MachineBasicBlock::iterator();		return MachineBasicBlock::iterator();
}		}

static inline bool isLEA(const int opcode) {		static inline bool isLEA(const int Opcode) {
return opcode == X86::LEA16r || opcode == X86::LEA32r ||		return Opcode == X86::LEA16r || Opcode == X86::LEA32r ||
opcode == X86::LEA64r || opcode == X86::LEA64_32r;		Opcode == X86::LEA64r || Opcode == X86::LEA64_32r;
		}

		static inline bool isInefficientLEAReg(unsigned int Reg) {
		return Reg == X86::EBP || Reg == X86::RBP || Reg == X86::R13;
		}

		static inline bool isRegOperand(const MachineOperand &Op) {
		return Op.isReg() && Op.getReg() != X86::NoRegister;
		}
		/// hasIneffecientLEARegs - LEA that uses base and index registers
		/// where the base is EBP, RBP, or R13
		static inline bool hasInefficientLEABaseReg(const MachineOperand &Base,
		const MachineOperand &Index) {
		return Base.isReg() && isInefficientLEAReg(Base.getReg()) &&
		isRegOperand(Index);
		}

		static inline bool hasLEAOffset(const MachineOperand &Offset) {
		return (Offset.isImm() && Offset.getImm() != 0) || Offset.isGlobal();
		}

		// LEA instruction that has all three operands: offset, base and index
		static inline bool isThreeOperandsLEA(const MachineOperand &Base,
		const MachineOperand &Index,
		const MachineOperand &Offset) {
		return isRegOperand(Base) && isRegOperand(Index) && hasLEAOffset(Offset);
		}

		static inline int getADDrrFromLEA(int LEAOpcode) {
		switch (LEAOpcode) {
		default:
		llvm_unreachable("Unexpected LEA instruction");
		case X86::LEA16r:
		return X86::ADD16rr;
		case X86::LEA32r:
		return X86::ADD32rr;
		case X86::LEA64_32r:
		case X86::LEA64r:
		return X86::ADD64rr;
		}
		}

		static inline int getADDriFromLEA(int LEAOpcode, const MachineOperand &Offset) {
		bool IsInt8 = Offset.isImm() && isInt<8>(Offset.getImm());
		switch (LEAOpcode) {
		default:
		llvm_unreachable("Unexpected LEA instruction");
		case X86::LEA16r:
		return IsInt8 ? X86::ADD16ri8 : X86::ADD16ri;
		case X86::LEA32r:
		case X86::LEA64_32r:
		return IsInt8 ? X86::ADD32ri8 : X86::ADD32ri;
		case X86::LEA64r:
		return IsInt8 ? X86::ADD64ri8 : X86::ADD64ri32;
		}
}		}

/// isLEASimpleIncOrDec - Does this LEA have one these forms:		/// isLEASimpleIncOrDec - Does this LEA have one these forms:
/// lea %reg, 1(%reg)		/// lea %reg, 1(%reg)
/// lea %reg, -1(%reg)		/// lea %reg, -1(%reg)
static inline bool isLEASimpleIncOrDec(MachineInstr &LEA) {		static inline bool isLEASimpleIncOrDec(MachineInstr &LEA) {
unsigned SrcReg = LEA.getOperand(1 + X86::AddrBaseReg).getReg();		unsigned SrcReg = LEA.getOperand(1 + X86::AddrBaseReg).getReg();
unsigned DstReg = LEA.getOperand(0).getReg();		unsigned DstReg = LEA.getOperand(0).getReg();
... 76 lines not shown ...	if (NewMI) {
processInstruction(J, MFI);		processInstruction(J, MFI);
}		}
}		}
}		}

void FixupLEAPass::processInstructionForSLM(MachineBasicBlock::iterator &I,		void FixupLEAPass::processInstructionForSLM(MachineBasicBlock::iterator &I,
MachineFunction::iterator MFI) {		MachineFunction::iterator MFI) {
MachineInstr &MI = *I;		MachineInstr &MI = *I;
const int opcode = MI.getOpcode();		const int Opcode = MI.getOpcode();
if (!isLEA(opcode))		if (!isLEA(Opcode))
return;		return;
if (MI.getOperand(5).getReg() != 0 || !MI.getOperand(4).isImm() ||		if (MI.getOperand(5).getReg() != 0 || !MI.getOperand(4).isImm() ||
!TII->isSafeToClobberEFLAGS(*MFI, I))		!TII->isSafeToClobberEFLAGS(*MFI, I))
return;		return;
const unsigned DstR = MI.getOperand(0).getReg();		const unsigned DstR = MI.getOperand(0).getReg();
const unsigned SrcR1 = MI.getOperand(1).getReg();		const unsigned SrcR1 = MI.getOperand(1).getReg();
const unsigned SrcR2 = MI.getOperand(3).getReg();		const unsigned SrcR2 = MI.getOperand(3).getReg();
if ((SrcR1 == 0 || SrcR1 != DstR) && (SrcR2 == 0 || SrcR2 != DstR))		if ((SrcR1 == 0 || SrcR1 != DstR) && (SrcR2 == 0 || SrcR2 != DstR))
return;		return;
if (MI.getOperand(2).getImm() > 1)		if (MI.getOperand(2).getImm() > 1)
return;		return;
int addrr_opcode, addri_opcode;
switch (opcode) {
default:
llvm_unreachable("Unexpected LEA instruction");
case X86::LEA16r:
addrr_opcode = X86::ADD16rr;
addri_opcode = X86::ADD16ri;
break;
case X86::LEA32r:
addrr_opcode = X86::ADD32rr;
addri_opcode = X86::ADD32ri;
break;
case X86::LEA64_32r:
case X86::LEA64r:
addrr_opcode = X86::ADD64rr;
addri_opcode = X86::ADD64ri32;
break;
}
DEBUG(dbgs() << "FixLEA: Candidate to replace:"; I->dump(););		DEBUG(dbgs() << "FixLEA: Candidate to replace:"; I->dump(););
DEBUG(dbgs() << "FixLEA: Replaced by: ";);		DEBUG(dbgs() << "FixLEA: Replaced by: ";);
MachineInstr *NewMI = nullptr;		MachineInstr *NewMI = nullptr;
const MachineOperand &Dst = MI.getOperand(0);
// Make ADD instruction for two registers writing to LEA's destination		// Make ADD instruction for two registers writing to LEA's destination
if (SrcR1 != 0 && SrcR2 != 0) {		if (SrcR1 != 0 && SrcR2 != 0) {
const MachineOperand &Src1 = MI.getOperand(SrcR1 == DstR ? 1 : 3);		const MCInstrDesc &ADDrr = TII->get(getADDrrFromLEA(Opcode));
const MachineOperand &Src2 = MI.getOperand(SrcR1 == DstR ? 3 : 1);		const MachineOperand &Src = MI.getOperand(SrcR1 == DstR ? 3 : 1);
NewMI = BuildMI(*MF, MI.getDebugLoc(), TII->get(addrr_opcode))		NewMI =
.add(Dst)		BuildMI(*MFI, I, MI.getDebugLoc(), ADDrr, DstR).addReg(DstR).add(Src);
.add(Src1)
.add(Src2);
MFI->insert(I, NewMI);
DEBUG(NewMI->dump(););		DEBUG(NewMI->dump(););
}		}
// Make ADD instruction for immediate		// Make ADD instruction for immediate
if (MI.getOperand(4).getImm() != 0) {		if (MI.getOperand(4).getImm() != 0) {
		const MCInstrDesc &ADDri =
		TII->get(getADDriFromLEA(Opcode, MI.getOperand(4)));
const MachineOperand &SrcR = MI.getOperand(SrcR1 == DstR ? 1 : 3);		const MachineOperand &SrcR = MI.getOperand(SrcR1 == DstR ? 1 : 3);
NewMI = BuildMI(*MF, MI.getDebugLoc(), TII->get(addri_opcode))		NewMI = BuildMI(*MFI, I, MI.getDebugLoc(), ADDri, DstR)
.add(Dst)
.add(SrcR)		.add(SrcR)
.addImm(MI.getOperand(4).getImm());		.addImm(MI.getOperand(4).getImm());
MFI->insert(I, NewMI);
DEBUG(NewMI->dump(););		DEBUG(NewMI->dump(););
}		}
if (NewMI) {		if (NewMI) {
MFI->erase(I);		MFI->erase(I);
I = static_cast<MachineBasicBlock::iterator>(NewMI);		I = NewMI;
		}
		}

		MachineInstr *
		FixupLEAPass::processInstrForSlow3OpLEA(MachineInstr &MI,
		MachineFunction::iterator MFI) {

		const int LEAOpcode = MI.getOpcode();
		if (!isLEA(LEAOpcode))
		return nullptr;

		const MachineOperand &Dst = MI.getOperand(0);
		const MachineOperand &Base = MI.getOperand(1);
		const MachineOperand &Scale = MI.getOperand(2);
		const MachineOperand &Index = MI.getOperand(3);
		const MachineOperand &Offset = MI.getOperand(4);
		const MachineOperand &Segment = MI.getOperand(5);

		if (!(isThreeOperandsLEA(Base, Index, Offset) ||
		hasInefficientLEABaseReg(Base, Index)) ||
		!TII->isSafeToClobberEFLAGS(*MFI, MI) ||
		Segment.getReg() != X86::NoRegister)
		return nullptr;

		unsigned int DstR = Dst.getReg();
		unsigned int BaseR = Base.getReg();
		unsigned int IndexR = Index.getReg();
		unsigned SSDstR =
		(LEAOpcode == X86::LEA64_32r) ? getX86SubSuperRegister(DstR, 64) : DstR;
		bool IsScale1 = Scale.getImm() == 1;
		bool IsInefficientBase = isInefficientLEAReg(BaseR);
		bool IsInefficientIndex = isInefficientLEAReg(IndexR);

		// Skip these cases since it takes more than 2 instructions
		// to replace the LEA instruction.
		if (IsInefficientBase && SSDstR == BaseR && !IsScale1)
		return nullptr;
		if (LEAOpcode == X86::LEA64_32r && IsInefficientBase &&
		(IsInefficientIndex || !IsScale1))
		return nullptr;

		const DebugLoc DL = MI.getDebugLoc();
		const MCInstrDesc &ADDrr = TII->get(getADDrrFromLEA(LEAOpcode));
		const MCInstrDesc &ADDri = TII->get(getADDriFromLEA(LEAOpcode, Offset));

		DEBUG(dbgs() << "FixLEA: Candidate to replace:"; MI.dump(););
		DEBUG(dbgs() << "FixLEA: Replaced by: ";);

		// First try to replace LEA with one or two (for the 3-op LEA case)
		// add instructions:
		// 1.lea (%base,%index,1), %base => add %index,%base
		// 2.lea (%base,%index,1), %index => add %base,%index
		if (IsScale1 && (DstR == BaseR || DstR == IndexR)) {
		const MachineOperand &Src = DstR == BaseR ? Index : Base;
		MachineInstr *NewMI =
		BuildMI(*MFI, MI, DL, ADDrr, DstR).addReg(DstR).add(Src);
		DEBUG(NewMI->dump(););
		// Create ADD instruction for the Offset in case of 3-Ops LEA.
		if (hasLEAOffset(Offset)) {
		NewMI = BuildMI(*MFI, MI, DL, ADDri, DstR).addReg(DstR).add(Offset);
		DEBUG(NewMI->dump(););
		}
		return NewMI;
		}
		// If the base is inefficient try switching the index and base operands,
		// otherwise just break the 3-Ops LEA inst into 2-Ops LEA + ADD instruction:
		// lea offset(%base,%index,scale),%dst =>
		// lea (%base,%index,scale),%dst; add offset,%dst
		if (!IsInefficientBase || (!IsInefficientIndex && IsScale1)) {
		MachineInstr *NewMI = BuildMI(*MFI, MI, DL, TII->get(LEAOpcode))
		.add(Dst)
		.add(IsInefficientBase ? Index : Base)
		.add(Scale)
		.add(IsInefficientBase ? Base : Index)
		.addImm(0)
		.add(Segment);
		DEBUG(NewMI->dump(););
		// Create ADD instruction for the Offset in case of 3-Ops LEA.
		if (hasLEAOffset(Offset)) {
		NewMI = BuildMI(*MFI, MI, DL, ADDri, DstR).addReg(DstR).add(Offset);
		DEBUG(NewMI->dump(););
		}
		return NewMI;
		}
		// Handle the rest of the cases with inefficient base register:
		assert(SSDstR != BaseR && "SSDstR == BaseR should be handled already!");
		assert(IsInefficientBase && "efficient base should be handled already!");

		// lea (%base,%index,1), %dst => mov %base,%dst; add %index,%dst
		if (IsScale1 && !hasLEAOffset(Offset)) {
		TII->copyPhysReg(*MFI, MI, DL, DstR, BaseR, Base.isKill());
		DEBUG(MI.getPrevNode()->dump(););

		MachineInstr *NewMI =
		BuildMI(*MFI, MI, DL, ADDrr, DstR).addReg(DstR).add(Index);
		DEBUG(NewMI->dump(););
		return NewMI;
}		}
		// lea offset(%base,%index,scale), %dst =>
		// lea offset( ,%index,scale), %dst; add %base,%dst
		MachineInstr *NewMI = BuildMI(*MFI, MI, DL, TII->get(LEAOpcode))
		.add(Dst)
		.addReg(0)
		.add(Scale)
		.add(Index)
		.add(Offset)
		.add(Segment);
		DEBUG(NewMI->dump(););

		NewMI = BuildMI(*MFI, MI, DL, ADDrr, DstR).addReg(DstR).add(Base);
		DEBUG(NewMI->dump(););
		return NewMI;
}		}

bool FixupLEAPass::processBasicBlock(MachineFunction &MF,		bool FixupLEAPass::processBasicBlock(MachineFunction &MF,
MachineFunction::iterator MFI) {		MachineFunction::iterator MFI) {

for (MachineBasicBlock::iterator I = MFI->begin(); I != MFI->end(); ++I) {		for (MachineBasicBlock::iterator I = MFI->begin(); I != MFI->end(); ++I) {
if (OptIncDec)		if (OptIncDec)
if (fixupIncDec(I, MFI))		if (fixupIncDec(I, MFI))
continue;		continue;

if (OptLEA) {		if (OptLEA) {
if (MF.getSubtarget<X86Subtarget>().isSLM())		if (MF.getSubtarget<X86Subtarget>().isSLM())
processInstructionForSLM(I, MFI);		processInstructionForSLM(I, MFI);
else
		else {
		if (MF.getSubtarget<X86Subtarget>().slow3OpsLEA()) {
		if (auto *NewMI = processInstrForSlow3OpLEA(*I, MFI)) {
		MFI->erase(I);
		I = NewMI;
		}
		} else
processInstruction(I, MFI);		processInstruction(I, MFI);
}		}
}		}
		}
return false;		return false;
}		}
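
The case analysis in processInstrForSlow3OpLEA can be hard to follow in diff form, so here is a rough standalone sketch of the order in which the rewrites are tried. Everything in it is illustrative rather than part of the patch: the `Lea` struct, the `Reg` enum, `isInefficientReg`, `planRewrite`, and the returned labels are hypothetical names, the definitions of "three operands" and "inefficient base" are assumptions drawn from the summary, and the LEA64_32r, EFLAGS, and segment checks are left out.

```cpp
#include <cassert>
#include <string>

// Made-up register numbers for this sketch; NoReg means "operand absent".
enum Reg { NoReg = 0, RegA, RegB, RegBP };

// Hypothetical stand-in for a LEA: dst = base + scale * index + offset.
struct Lea {
  int dst;
  int base;
  int scale;
  int index;
  long offset;
};

// Assumption: only the frame-pointer-like register is "inefficient" here
// (the patch summary names EBP, RBP and R13).
bool isInefficientReg(int r) { return r == RegBP; }

// Returns a label for the replacement sequence, or "keep" if the LEA stays.
std::string planRewrite(const Lea &l) {
  assert(l.scale == 1 || l.scale == 2 || l.scale == 4 || l.scale == 8);
  const bool hasOffset = l.offset != 0;
  const bool threeOp = l.base != NoReg && l.index != NoReg && hasOffset;
  const bool slowBase = isInefficientReg(l.base) && l.index != NoReg;
  if (!threeOp && !slowBase)
    return "keep"; // not one of the two slow forms handled here

  const bool scale1 = l.scale == 1;
  const bool badBase = isInefficientReg(l.base);
  const bool badIndex = isInefficientReg(l.index);

  // Would need more than two instructions, so leave the LEA alone.
  if (badBase && l.dst == l.base && !scale1)
    return "keep";
  // dst aliases base or index with scale 1: plain adds suffice.
  if (scale1 && (l.dst == l.base || l.dst == l.index))
    return hasOffset ? "add,add-imm" : "add";
  // Efficient base, or base and index can be swapped: 2-operand LEA.
  if (!badBase || (!badIndex && scale1))
    return hasOffset ? "lea,add-imm" : "lea";
  // Remaining cases all have an inefficient base.
  if (scale1 && !hasOffset)
    return "mov,add";
  return "lea-no-base,add";
}
```

The labels line up with the groups of MIR tests added by this patch (test2add, test1add, testleaadd, test1mov1add, testleaadd_ebp_index, test_skip_opt), which is a convenient way to cross-check the ordering of the branches.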

llvm/trunk/lib/Target/X86/X86Subtarget.h

[247 lines not shown]
	protected:

/// True if the LEA instruction inputs have to be ready at address generation		/// True if the LEA instruction inputs have to be ready at address generation
/// (AG) time.		/// (AG) time.
bool LEAUsesAG;		bool LEAUsesAG;

/// True if the LEA instruction with certain arguments is slow		/// True if the LEA instruction with certain arguments is slow
bool SlowLEA;		bool SlowLEA;

		/// True if the LEA instruction has all three source operands: base, index,
		/// and offset or if the LEA instruction uses base and index registers where
		/// the base is EBP, RBP, or R13
		bool Slow3OpsLEA;

/// True if INC and DEC instructions are slow when writing to flags		/// True if INC and DEC instructions are slow when writing to flags
bool SlowIncDec;		bool SlowIncDec;

/// Processor has AVX-512 PreFetch Instructions		/// Processor has AVX-512 PreFetch Instructions
bool HasPFI;		bool HasPFI;

/// Processor has AVX-512 Exponential and Reciprocal Instructions		/// Processor has AVX-512 Exponential and Reciprocal Instructions
bool HasERI;		bool HasERI;
[221 lines not shown]
	public:
bool hasFastSHLDRotate() const { return HasFastSHLDRotate; }		bool hasFastSHLDRotate() const { return HasFastSHLDRotate; }
bool hasERMSB() const { return HasERMSB; }		bool hasERMSB() const { return HasERMSB; }
bool hasSlowDivide32() const { return HasSlowDivide32; }		bool hasSlowDivide32() const { return HasSlowDivide32; }
bool hasSlowDivide64() const { return HasSlowDivide64; }		bool hasSlowDivide64() const { return HasSlowDivide64; }
bool padShortFunctions() const { return PadShortFunctions; }		bool padShortFunctions() const { return PadShortFunctions; }
bool callRegIndirect() const { return CallRegIndirect; }		bool callRegIndirect() const { return CallRegIndirect; }
bool LEAusesAG() const { return LEAUsesAG; }		bool LEAusesAG() const { return LEAUsesAG; }
bool slowLEA() const { return SlowLEA; }		bool slowLEA() const { return SlowLEA; }
		bool slow3OpsLEA() const { return Slow3OpsLEA; }
bool slowIncDec() const { return SlowIncDec; }		bool slowIncDec() const { return SlowIncDec; }
bool hasCDI() const { return HasCDI; }		bool hasCDI() const { return HasCDI; }
bool hasPFI() const { return HasPFI; }		bool hasPFI() const { return HasPFI; }
bool hasERI() const { return HasERI; }		bool hasERI() const { return HasERI; }
bool hasDQI() const { return HasDQI; }		bool hasDQI() const { return HasDQI; }
bool hasBWI() const { return HasBWI; }		bool hasBWI() const { return HasBWI; }
bool hasVLX() const { return HasVLX; }		bool hasVLX() const { return HasVLX; }
bool hasPKU() const { return HasPKU; }		bool hasPKU() const { return HasPKU; }
[159 lines not shown]

llvm/trunk/lib/Target/X86/X86TargetMachine.cpp

	[55 lines not shown]

	static cl::opt<bool> EnableMachineCombinerPass("x86-machine-combiner",			static cl::opt<bool> EnableMachineCombinerPass("x86-machine-combiner",
	cl::desc("Enable the machine combiner pass"),			cl::desc("Enable the machine combiner pass"),
	cl::init(true), cl::Hidden);			cl::init(true), cl::Hidden);

	namespace llvm {			namespace llvm {

	void initializeWinEHStatePassPass(PassRegistry &);			void initializeWinEHStatePassPass(PassRegistry &);
				void initializeFixupLEAPassPass(PassRegistry &);
	void initializeX86ExecutionDepsFixPass(PassRegistry &);			void initializeX86ExecutionDepsFixPass(PassRegistry &);

	} // end namespace llvm			} // end namespace llvm

	extern "C" void LLVMInitializeX86Target() {			extern "C" void LLVMInitializeX86Target() {
	// Register the target.			// Register the target.
	RegisterTargetMachine<X86TargetMachine> X(getTheX86_32Target());			RegisterTargetMachine<X86TargetMachine> X(getTheX86_32Target());
	RegisterTargetMachine<X86TargetMachine> Y(getTheX86_64Target());			RegisterTargetMachine<X86TargetMachine> Y(getTheX86_64Target());

	PassRegistry &PR = *PassRegistry::getPassRegistry();			PassRegistry &PR = *PassRegistry::getPassRegistry();
	initializeGlobalISel(PR);			initializeGlobalISel(PR);
	initializeWinEHStatePassPass(PR);			initializeWinEHStatePassPass(PR);
	initializeFixupBWInstPassPass(PR);			initializeFixupBWInstPassPass(PR);
	initializeEvexToVexInstPassPass(PR);			initializeEvexToVexInstPassPass(PR);
				initializeFixupLEAPassPass(PR);
	initializeX86ExecutionDepsFixPass(PR);			initializeX86ExecutionDepsFixPass(PR);
	}			}

	static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {			static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {
	if (TT.isOSBinFormatMachO()) {			if (TT.isOSBinFormatMachO()) {
	if (TT.getArch() == Triple::x86_64)			if (TT.getArch() == Triple::x86_64)
	return llvm::make_unique<X86_64MachoTargetObjectFile>();			return llvm::make_unique<X86_64MachoTargetObjectFile>();
	return llvm::make_unique<TargetLoweringObjectFileMachO>();			return llvm::make_unique<TargetLoweringObjectFileMachO>();
	[383 lines not shown]

llvm/trunk/test/CodeGen/X86/leaFixup32.mir

				# RUN: llc -run-pass x86-fixup-LEAs -mcpu=corei7-avx -o - %s | FileCheck %s
				--- |
				; ModuleID = 'test/CodeGen/X86/fixup-lea.ll'
				source_filename = "test/CodeGen/X86/fixup-lea.ll"
				target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
				target triple = "i386"
				;generated using: llc -stop-after x86-pad-short-functions fixup-lea.ll > leaFixup32.mir

				;test2add_32: 3 operands LEA32r that can be replaced with 2 add instructions
				; where ADD32ri8 is chosen
				define i32 @test2add_32() {
				ret i32 0
				}

				;test2add_ebp_32: 3 operands LEA32r that can be replaced with 2 add instructions
				; where the base is rbp/r13/ebp register
				define i32 @test2add_ebp_32() {
				ret i32 0
				}

				;test1add_ebp_32: 2 operands LEA32r where base register is ebp and can be replaced
				; with an add instruction
				define i32 @test1add_ebp_32() {
				ret i32 0
				}

				;testleaadd_32: 3 operands LEA32r that can be replaced with 1 lea 1 add instructions
				define i32 @testleaadd_32() {
				ret i32 0
				}

				;testleaadd_ebp_32: 3 operands LEA32r that can be replaced with 1 lea 1 add instructions
				; where the base is ebp register
				define i32 @testleaadd_ebp_32() {
				ret i32 0
				}

				;test1lea_ebp_32: 2 operands LEA32r where base register is rbp/r13/ebp and can be replaced
				; with a lea instruction
				define i32 @test1lea_ebp_32() {
				ret i32 0
				}

				;test2addi32_32: 3 operands LEA32r that can be replaced with 2 add instructions where ADD32ri32
				; is chosen
				define i32 @test2addi32_32() {
				ret i32 0
				}

				;test1mov1add_ebp_32: 2 operands LEA32r that can be replaced with 1 add 1 mov instructions
				; where the base is rbp/r13/ebp register
				define i32 @test1mov1add_ebp_32() {
				ret i32 0
				}

				;testleaadd_ebp_index_32: 3 operands LEA32r that can be replaced with 1 lea 1 add instructions
				; where the base and the index are ebp register and there is offset
				define i32 @testleaadd_ebp_index_32() {
				ret i32 0
				}

				;testleaadd_ebp_index2_32: 3 operands LEA32r that can be replaced with 1 lea 1 add instructions
				; where the base and the index are ebp register and there is scale
				define i32 @testleaadd_ebp_index2_32() {
				ret i32 0
				}

				;test_skip_opt_32: 3 operands LEA32r that can not be replaced with 2 instructions
				define i32 @test_skip_opt_32() {
				ret i32 0
				}

				;test_skip_eflags_32: LEA32r that cannot be replaced since it's not safe to clobber eflags
				define i32 @test_skip_eflags_32() {
				ret i32 0
				}

				...
				---
				name: test2add_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp
				; CHECK: %eax = ADD32rr %eax, killed %ebp
				; CHECK: %eax = ADD32ri8 %eax, -5

				%eax = LEA32r killed %eax, 1, killed %ebp, -5, _
				RETQ %eax

				...
				---
				name: test2add_ebp_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp
				; CHECK: %ebp = ADD32rr %ebp, killed %eax
				; CHECK: %ebp = ADD32ri8 %ebp, -5

				%ebp = LEA32r killed %ebp, 1, killed %eax, -5, _
				RETQ %ebp

				...
				---
				name: test1add_ebp_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp
				; CHECK: %ebp = ADD32rr %ebp, killed %eax

				%ebp = LEA32r killed %ebp, 1, killed %eax, 0, _
				RETQ %ebp

				...
				---
				name: testleaadd_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				- { reg: '%ebx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp, %esi
				; CHECK: %ebx = LEA32r killed %eax, 1, killed %ebp, 0
				; CHECK: %ebx = ADD32ri8 %ebx, -5

				%ebx = LEA32r killed %eax, 1, killed %ebp, -5, _
				RETQ %ebx

				...
				---
				name: testleaadd_ebp_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				- { reg: '%ebx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp
				; CHECK: %ebx = LEA32r killed %eax, 1, killed %ebp, 0, _
				; CHECK: %ebx = ADD32ri8 %ebx, -5

				%ebx = LEA32r killed %ebp, 1, killed %eax, -5, _
				RETQ %ebx

				...
				---
				name: test1lea_ebp_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				- { reg: '%ebx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp
				; CHECK: %ebx = LEA32r killed %eax, 1, killed %ebp, 0, _

				%ebx = LEA32r killed %ebp, 1, killed %eax, 0, _
				RETQ %ebx

				...
				---
				name: test2addi32_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp
				; CHECK: %eax = ADD32rr %eax, killed %ebp
				; CHECK: %eax = ADD32ri %eax, 129

				%eax = LEA32r killed %eax, 1, killed %ebp, 129, _
				RETQ %eax

				...
				---
				name: test1mov1add_ebp_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%eax' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp, %ebx
				; CHECK: %ebx = MOV32rr killed %ebp
				; CHECK: %ebx = ADD32rr %ebx, killed %ebp

				%ebx = LEA32r killed %ebp, 1, killed %ebp, 0, _
				RETQ %ebx

				...
				---
				name: testleaadd_ebp_index_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%ebx' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp, %ebx
				; CHECK: %ebx = LEA32r _, 1, killed %ebp, 5, _
				; CHECK: %ebx = ADD32rr %ebx, killed %ebp

				%ebx = LEA32r killed %ebp, 1, killed %ebp, 5, _
				RETQ %ebx

				...
				---
				name: testleaadd_ebp_index2_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%ebx' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp, %ebx
				; CHECK: %ebx = LEA32r _, 4, killed %ebp, 5, _
				; CHECK: %ebx = ADD32rr %ebx, killed %ebp

				%ebx = LEA32r killed %ebp, 4, killed %ebp, 5, _
				RETQ %ebx

				...
				---
				name: test_skip_opt_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%ebx' }
				- { reg: '%ebp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp, %ebx
				; CHECK: %ebp = LEA32r killed %ebp, 4, killed %ebp, 0, _

				%ebp = LEA32r killed %ebp, 4, killed %ebp, 0, _
				RETQ %ebp

				...
				---
				name: test_skip_eflags_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%ebp' }
				- { reg: '%eax' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp, %ebx
				; CHECK: %ebx = LEA32r killed %eax, 4, killed %eax, 5, _
				; CHECK: %ebp = LEA32r killed %ebx, 4, killed %ebx, 0, _
				; CHECK: %ebp = ADD32ri8 %ebp, 5

				CMP32rr %eax, killed %ebx, implicit-def %eflags
				%ebx = LEA32r killed %eax, 4, killed %eax, 5, _
				JE_1 %bb.1, implicit %eflags
				RETQ %ebx
				bb.1:
				liveins: %eax, %ebp, %ebx
				%ebp = LEA32r killed %ebx, 4, killed %ebx, 5, _
				RETQ %ebp

				...
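
The tests above expect ADD32ri8 for an offset of -5 (test2add_32) and ADD32ri for an offset of 129 (test2addi32_32). The helper that makes this choice, getADDriFromLEA, is not visible in this part of the diff, so the following is only a guess at the rule the tests imply: use the 8-bit immediate form when the offset fits in a signed byte. `fitsInInt8` and `pickAdd32Imm` are hypothetical names, and the opcodes are represented as plain strings.

```cpp
#include <cstdint>
#include <string>

// True if v is representable as a signed 8-bit immediate.
bool fitsInInt8(std::int64_t v) { return v >= -128 && v <= 127; }

// Illustrative only: picks the 32-bit ADD immediate form by operand width.
std::string pickAdd32Imm(std::int64_t imm) {
  return fitsInInt8(imm) ? "ADD32ri8" : "ADD32ri";
}
```

Under this assumption the shorter encoding is preferred whenever it is available, which matches the two CHECK lines quoted above.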

llvm/trunk/test/CodeGen/X86/leaFixup64.mir

				# RUN: llc -run-pass x86-fixup-LEAs -mcpu=corei7-avx -o - %s | FileCheck %s
				--- |
				; ModuleID = 'lea-2.ll'
				source_filename = "lea-2.ll"
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				;generated using: llc -stop-after x86-pad-short-functions lea-2.ll > leaFixup64.mir

				;testleaadd_64_32_1: 3 operands LEA64_32r cannot be replaced with 2 add instructions
				; but can be replaced with 1 lea + 1 add
				define i32 @testleaadd_64_32_1() {
				ret i32 0
				}

				;testleaadd_rbp_64_32_1: 3 operands LEA64_32r cannot be replaced with 2 add instructions
				; where the base is rbp/r13/ebp register but it can be replaced with 1 lea + 1 add
				define i32 @testleaadd_rbp_64_32_1() {
				ret i32 0
				}

				;test1lea_rbp_64_32_1: 2 operands LEA64_32r where base register is rbp/r13/ebp and can not
				; be replaced with an add instruction but can be replaced with 1 lea instruction
				define i32 @test1lea_rbp_64_32_1() {
				ret i32 0
				}

				;test2add_64: 3 operands LEA64r that can be replaced with 2 add instructions
				define i32 @test2add_64() {
				ret i32 0
				}

				;test2add_rbp_64: 3 operands LEA64r that can be replaced with 2 add instructions
				; where the base is rbp/r13/ebp register
				define i32 @test2add_rbp_64() {
				ret i32 0
				}

				;test1add_rbp_64: 2 operands LEA64r where base register is rbp/r13/ebp and can be replaced
				; with an add instruction
				define i32 @test1add_rbp_64() {
				ret i32 0
				}

				;testleaadd_64_32: 3 operands LEA64_32r that can be replaced with 1 lea 1 add instructions
				define i32 @testleaadd_64_32() {
				ret i32 0
				}

				;testleaadd_rbp_64_32: 3 operands LEA64_32r that can be replaced with 1 lea 1 add instructions
				; where the base is rbp/r13/ebp register
				define i32 @testleaadd_rbp_64_32() {
				ret i32 0
				}

				;test1lea_rbp_64_32: 2 operands LEA64_32r where base register is rbp/r13/ebp and can be replaced
				; with a lea instruction
				define i32 @test1lea_rbp_64_32() {
				ret i32 0
				}

				;testleaadd_64: 3 operands LEA64r that can be replaced with 1 lea 1 add instructions
				define i32 @testleaadd_64() {
				ret i32 0
				}

				;testleaadd_rbp_64: 3 operands LEA64r that can be replaced with 1 lea 1 add instructions
				; where the base is rbp/r13/ebp register
				define i32 @testleaadd_rbp_64() {
				ret i32 0
				}

				;test1lea_rbp_64: 2 operands LEA64r where base register is rbp/r13/ebp and can be replaced
				; with a lea instruction
				define i32 @test1lea_rbp_64() {
				ret i32 0
				}

				;test8: dst = base & scale!=1, can't optimize
				define i32 @test8() {
				ret i32 0
				}

				;testleaaddi32_64_32: 3 operands LEA64_32r that can be replaced with 1 lea + 1 add instructions where
				; ADD64ri32 is chosen
				define i32 @testleaaddi32_64_32() {
				ret i32 0
				}

				;test1mov1add_rbp_64_32: 2 operands LEA64_32r cannot be replaced with 1 add 1 mov instructions
				; where the base is rbp/r13/ebp register
				define i32 @test1mov1add_rbp_64_32() {
				ret i32 0
				}

				;testleaadd_rbp_index_64_32: 3 operands LEA64_32r that cannot be replaced with 1 lea 1 add instructions
				; where the base and the index are ebp register and there is offset
				define i32 @testleaadd_rbp_index_64_32() {
				ret i32 0
				}

				;testleaadd_rbp_index2_64_32: 3 operands LEA64_32r that cannot be replaced with 1 lea 1 add instructions
				; where the base and the index are ebp register and there is scale
				define i32 @testleaadd_rbp_index2_64_32() {
				ret i32 0
				}

				;test2addi32_64: 3 operands LEA64r that can be replaced with 2 add instructions where ADD64ri32
				; is chosen
				define i32 @test2addi32_64() {
				ret i32 0
				}

				;test1mov1add_rbp_64: 2 operands LEA64r that can be replaced with 1 add 1 mov instructions
				; where the base is rbp/r13/ebp register
				define i32 @test1mov1add_rbp_64() {
				ret i32 0
				}

				;testleaadd_rbp_index_64: 3 operands LEA64r that can be replaced with 1 lea 1 add instructions
				; where the base and the index are ebp register and there is offset
				define i32 @testleaadd_rbp_index_64() {
				ret i32 0
				}

				;testleaadd_rbp_index2_64: 3 operands LEA64r that can be replaced with 1 lea 1 add instructions
				; where the base and the index are ebp register and there is scale
				define i32 @testleaadd_rbp_index2_64() {
				ret i32 0
				}

				;test_skip_opt_64: 3 operands LEA64r that can not be replaced with 2 instructions
				define i32 @test_skip_opt_64() {
				ret i32 0
				}

				;test_skip_eflags_64: LEA64r that cannot be replaced since it's not safe to clobber eflags
				define i32 @test_skip_eflags_64() {
				ret i32 0
				}

				;test_skip_opt_64_32: 3 operands LEA64_32r that can not be replaced with 2 instructions
				define i32 @test_skip_opt_64_32() {
				ret i32 0
				}

				;test_skip_eflags_64_32: LEA64_32r that cannot be replaced since it's not safe to clobber eflags
				define i32 @test_skip_eflags_64_32() {
				ret i32 0
				}


				...
				---
				name: testleaadd_64_32_1
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %eax = LEA64_32r killed %rax, 1, killed %rbp, 0
				; CHECK: %eax = ADD32ri8 %eax, -5

				%eax = LEA64_32r killed %rax, 1, killed %rbp, -5, _
				RETQ %eax

				...
				---
				name: testleaadd_rbp_64_32_1
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %ebp = LEA64_32r killed %rax, 1, killed %rbp, 0
				; CHECK: %ebp = ADD32ri8 %ebp, -5

				%ebp = LEA64_32r killed %rbp, 1, killed %rax, -5, _
				RETQ %ebp

				...
				---
				name: test1lea_rbp_64_32_1
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %ebp = LEA64_32r killed %rax, 1, killed %rbp, 0

				%ebp = LEA64_32r killed %rbp, 1, killed %rax, 0, _
				RETQ %ebp

				...
				---
				name: test2add_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %rax = ADD64rr %rax, killed %rbp
				; CHECK: %rax = ADD64ri8 %rax, -5

				%rax = LEA64r killed %rax, 1, killed %rbp, -5, _
				RETQ %eax

				...
				---
				name: test2add_rbp_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %rbp = ADD64rr %rbp, killed %rax
				; CHECK: %rbp = ADD64ri8 %rbp, -5

				%rbp = LEA64r killed %rbp, 1, killed %rax, -5, _
				RETQ %ebp

				...
				---
				name: test1add_rbp_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %rbp = ADD64rr %rbp, killed %rax

				%rbp = LEA64r killed %rbp, 1, killed %rax, 0, _
				RETQ %ebp

				...
				---
				name: testleaadd_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				- { reg: '%rbx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %ebx = LEA64_32r killed %rax, 1, killed %rbp, 0, _
				; CHECK: %ebx = ADD32ri8 %ebx, -5

				%ebx = LEA64_32r killed %rax, 1, killed %rbp, -5, _
				RETQ %ebx

				...
				---
				name: testleaadd_rbp_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				- { reg: '%rbx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %ebx = LEA64_32r killed %rax, 1, killed %rbp, 0, _
				; CHECK: %ebx = ADD32ri8 %ebx, -5

				%ebx = LEA64_32r killed %rbp, 1, killed %rax, -5, _
				RETQ %ebx

				...
				---
				name: test1lea_rbp_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				- { reg: '%rbx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %ebx = LEA64_32r killed %rax, 1, killed %rbp, 0, _

				%ebx = LEA64_32r killed %rbp, 1, killed %rax, 0, _
				RETQ %ebx

				...
				---
				name: testleaadd_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				- { reg: '%rbx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %rbx = LEA64r killed %rax, 1, killed %rbp, 0, _
				; CHECK: %rbx = ADD64ri8 %rbx, -5

				%rbx = LEA64r killed %rax, 1, killed %rbp, -5, _
				RETQ %ebx

				...
				---
				name: testleaadd_rbp_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				- { reg: '%rbx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %rbx = LEA64r killed %rax, 1, killed %rbp, 0, _
				; CHECK: %rbx = ADD64ri8 %rbx, -5

				%rbx = LEA64r killed %rbp, 1, killed %rax, -5, _
				RETQ %ebx

				...
				---
				name: test1lea_rbp_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				- { reg: '%rbx' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %rbx = LEA64r killed %rax, 1, killed %rbp, 0, _

				%rbx = LEA64r killed %rbp, 1, killed %rax, 0, _
				RETQ %ebx

				...
				---
				name: test8
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rdi' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rdi, %rbp
				; CHECK: %r12 = LEA64r _, 2, killed %r13, 5, _
				; CHECK: %r12 = ADD64rr %r12, killed %rbp
				%rbp = KILL %rbp, implicit-def %rbp
				%r13 = KILL %rdi, implicit-def %r13
				%r12 = LEA64r killed %rbp, 2, killed %r13, 5, _
				RETQ %r12

				...
				---
				name: testleaaddi32_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %eax = LEA64_32r killed %rax, 1, killed %rbp, 0
				; CHECK: %eax = ADD32ri %eax, 129

				%eax = LEA64_32r killed %rax, 1, killed %rbp, 129, _
				RETQ %eax

				...
				---
				name: test1mov1add_rbp_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %ebx = LEA64_32r killed %rbp, 1, killed %rbp, 0, _

				%ebx = LEA64_32r killed %rbp, 1, killed %rbp, 0, _
				RETQ %ebx

				...
				---
				name: testleaadd_rbp_index_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbx' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %ebx = LEA64_32r killed %rbp, 1, killed %rbp, 5, _

				%ebx = LEA64_32r killed %rbp, 1, killed %rbp, 5, _
				RETQ %ebx

				...
				---
				name: testleaadd_rbp_index2_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbx' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %eax, %ebp, %ebx
				; CHECK: %ebx = LEA64_32r killed %rbp, 4, killed %rbp, 5, _

				%ebx = LEA64_32r killed %rbp, 4, killed %rbp, 5, _
				RETQ %ebx

				...
				---
				name: test2addi32_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp
				; CHECK: %rax = ADD64rr %rax, killed %rbp
				; CHECK: %rax = ADD64ri32 %rax, 129

				%rax = LEA64r killed %rax, 1, killed %rbp, 129, _
				RETQ %eax

				...
				---
				name: test1mov1add_rbp_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rax' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %rbx = MOV64rr killed %rbp
				; CHECK: %rbx = ADD64rr %rbx, killed %rbp

				%rbx = LEA64r killed %rbp, 1, killed %rbp, 0, _
				RETQ %ebx

				...
				---
				name: testleaadd_rbp_index_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbx' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %rbx = LEA64r _, 1, killed %rbp, 5, _
				; CHECK: %rbx = ADD64rr %rbx, killed %rbp

				%rbx = LEA64r killed %rbp, 1, killed %rbp, 5, _
				RETQ %ebx

				...
				---
				name: testleaadd_rbp_index2_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbx' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %rbx = LEA64r _, 4, killed %rbp, 5, _
				; CHECK: %rbx = ADD64rr %rbx, killed %rbp

				%rbx = LEA64r killed %rbp, 4, killed %rbp, 5, _
				RETQ %ebx

				...
				---
				name: test_skip_opt_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbx' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %rbp = LEA64r killed %rbp, 4, killed %rbp, 0, _

				%rbp = LEA64r killed %rbp, 4, killed %rbp, 0, _
				RETQ %ebp

				...
				---
				name: test_skip_eflags_64
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbp' }
				- { reg: '%rax' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %rbx = LEA64r killed %rax, 4, killed %rax, 5, _
				; CHECK: %rbp = LEA64r killed %rbx, 4, killed %rbx, 0, _
				; CHECK: %rbp = ADD64ri8 %rbp, 5

				CMP64rr %rax, killed %rbx, implicit-def %eflags
				%rbx = LEA64r killed %rax, 4, killed %rax, 5, _
				JE_1 %bb.1, implicit %eflags
				RETQ %ebx
				bb.1:
				liveins: %rax, %rbp, %rbx
				%rbp = LEA64r killed %rbx, 4, killed %rbx, 5, _
				RETQ %ebp

				...
				---
				name: test_skip_opt_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbx' }
				- { reg: '%rbp' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %ebp = LEA64_32r killed %rbp, 4, killed %rbp, 0, _

				%ebp = LEA64_32r killed %rbp, 4, killed %rbp, 0, _
				RETQ %ebp

				...
				---
				name: test_skip_eflags_64_32
				alignment: 4
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				liveins:
				- { reg: '%rbp' }
				- { reg: '%rax' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				body: |
				bb.0 (%ir-block.0):
				liveins: %rax, %rbp, %rbx
				; CHECK: %ebx = LEA64_32r killed %rax, 4, killed %rax, 5, _
				; CHECK: %ebp = LEA64_32r killed %rbx, 4, killed %rbx, 0, _
				; CHECK: %ebp = ADD32ri8 %ebp, 5

				CMP64rr %rax, killed %rbx, implicit-def %eflags
				%ebx = LEA64_32r killed %rax, 4, killed %rax, 5, _
				JE_1 %bb.1, implicit %eflags
				RETQ %ebx
				bb.1:
				liveins: %rax, %rbp, %rbx
				%ebp = LEA64_32r killed %rbx, 4, killed %rbx, 5, _
				RETQ %ebp

				...