This is an archive of the discontinued LLVM Phabricator instance.

[X86] New pass that moves immediate operands to registers.
Needs Review, Public

Authored by sepavloff on Sep 30 2014, 10:24 AM.

This revision needs review, but there are no reviewers specified.

Details

Reviewers: None

Summary

The pass scans machine instructions looking for uses of immediate
values. If the same value is used in neighbouring instructions, it
is moved to a register, if this is possible and reduces code size.
For instance, instructions

mov $0, 0x4(%esi)
mov $0, 0x8(%esi)

can be replaced by

mov $0, %eax
mov %eax, 0x4(%esi)
mov %eax, 0x8(%esi)

which is shorter in code size.
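
The saving can be checked by counting encoded bytes. The sketch below is only an illustration, assuming 32-bit x86 encodings with a base register and an 8-bit displacement (C7 /0 id for storing an immediate, 89 /r for storing a register, B8+rd id for loading an immediate). The constant and function names are hypothetical and do not appear in the patch.

```cpp
// Byte sizes of the 32-bit x86 encodings involved, assuming a base register
// plus an 8-bit displacement, e.g. 0x4(%esi). Names are illustrative only.
constexpr unsigned StoreImm32Size = 1 + 1 + 1 + 4; // C7 /0, ModRM, disp8, imm32
constexpr unsigned StoreReg32Size = 1 + 1 + 1;     // 89 /r, ModRM, disp8
constexpr unsigned LoadImm32Size = 1 + 4;          // B8+rd, imm32

// Bytes needed when every store carries its own immediate.
constexpr unsigned sizeWithImmediates(unsigned NumStores) {
  return NumStores * StoreImm32Size;
}

// Bytes needed when the immediate is loaded into a register once.
constexpr unsigned sizeWithRegister(unsigned NumStores) {
  return LoadImm32Size + NumStores * StoreReg32Size;
}
```

For the two stores above this gives 14 bytes against 11. With a single store the register form is longer (8 bytes against 7), so the transformation only pays off when the value has at least two nearby uses.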

This patch fixes PR5124. It incorporates feedback on the previous patch, discussed in the thread
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130916/188079.html .

Diff Detail

Event Timeline

sepavloff updated this revision to Diff 14236. Sep 30 2014, 10:24 AM

sepavloff retitled this revision to [X86] New pass that moves immediate operands to registers.

sepavloff updated this object.

sepavloff edited the test plan for this revision.

sepavloff added a subscriber: Unknown Object (MLST).

Should we do this carefully, with consideration of register pressure?
Even though it reduces code size, it increases register pressure. Do
you have benchmark results on the impact of this pass?

Michael

Could you take a look at http://llvm.org/bugs/show_bug.cgi?id=9517 and see if your pass could help with that?

Unfortunately, no. When this pass runs, asm statements are not yet expanded.

Thanks,
--Serge

2014-10-01 13:58 GMT+07:00 Joerg Sonnenberger <joerg@netbsd.org>:

Could you take a look at http://llvm.org/bugs/show_bug.cgi?id=9517 and
see if your pass could help with that?

http://reviews.llvm.org/D5544

Hi Serge,

A couple of general comments:

None of the comments uses the doxygen style: /// instead of //. Please fix it.
Comments start with a capital letter and end with a period (or any appropriate punctuation :)).
As I said in one of my emails, we should do that just for Os and Oz functions (i.e., look for the function attributes OptimizeForSize and MinSize).

The overall approach seems more complicated than I would expect for such a pass. In particular, I think that all the InReg, Scheduled, and Waiting states could be avoided, as well as all the history thing with the expiration date.
My understanding is that you try to limit the impact of this transformation on the register pressure. That seems fragile to me (there are a lot of magic numbers involved here for the different limits) and harder than it should be to understand.

What I would recommend is having more faith in the register allocator. The live ranges you are creating/extending are rematerializable. Therefore, if the register pressure becomes too high, the register allocator can theoretically rematerialize the values instead of spilling.
The main problem, then, is that you may end up with code like this:
reg = mov imm
reg2 = op reg3, reg
instead of
reg2 = op reg3, imm

That said, we could probably teach the register allocator to “fold” the operand as we do for memory load, so I wouldn’t worry too much about that.

The bottom line is that I would expect the pass to be a relatively simple scan that creates appropriate constants when the whole basic block has been traversed.
I.e., something like:
ConstantsUsage // Sort of dictionary. Maps a constant to its uses.
for each instr in bb
  if instr can use register variant
    record instr, instr.imm in ConstantsUsage // This could be the tricky part. See below for some thoughts.

for each constant in ConstantsUsage
  if constant is profitable
    materialize constant // This creates one constant and updates all its uses using the appropriate subreg, removing the redundant load imm, etc.
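
The outline above could be sketched in C++ roughly as follows. This is a minimal, self-contained illustration and not code from the patch: the Instr struct, the LoadCost of 5 bytes (the size of a 32-bit mov imm, reg) and the per-use Profit values are all assumptions made for the example.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a machine instruction with one immediate operand.
struct Instr {
  int64_t Imm;        // Immediate operand value.
  bool HasRegVariant; // True if an equivalent register form exists.
  int Profit;         // Bytes saved if the immediate becomes a register.
  bool UsesReg;       // Set once the instruction has been rewritten.
};

// Assumed cost in bytes of the extra 'mov imm, reg'.
constexpr int LoadCost = 5;

// Rewrites profitable uses in one basic block and returns the bytes saved.
int materializeConstants(std::vector<Instr> &BB) {
  // Phase 1: map each constant to the instructions that use it.
  std::map<int64_t, std::vector<Instr *>> ConstantsUsage;
  for (Instr &I : BB)
    if (I.HasRegVariant)
      ConstantsUsage[I.Imm].push_back(&I);

  // Phase 2: materialize every constant whose uses pay for the load.
  int Saved = 0;
  for (auto &Entry : ConstantsUsage) {
    int Gain = -LoadCost;
    for (Instr *I : Entry.second)
      Gain += I->Profit;
    if (Gain <= 0)
      continue; // Not profitable, keep the immediates.
    for (Instr *I : Entry.second)
      I->UsesReg = true;
    Saved += Gain;
  }
  return Saved;
}
```

Because the decision is taken only after the whole block has been scanned, no per-value state machine is needed; the cost is that all candidate uses are held in the map until the end of the block.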

Regarding ‘record instr, imm, ConstantsUsage’.
The hard part, per se, is to enable reuse of a wide constant, e.g., 0x0fff is used for 0xff.
I believe that you wouldn’t have that many constants to look through per basic block and a linear search may be sufficient.
Now, if we want something faster, we could do some trick like having a mapping per size, sorted by profitability, etc.
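
A linear search for a reusable wide constant might look like the sketch below. It is only an illustration under stated assumptions: MaterializedConst, lowPartMatches and findReusable are hypothetical names, and "reuse" is taken to mean that the narrow value equals the low bytes of a value already held in a register, so it could be read through a sub-register.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A constant already held in a register (hypothetical record).
struct MaterializedConst {
  uint64_t Value;
  unsigned Bytes; // Width of the register holding Value: 1, 2, 4 or 8.
};

// Returns true if the low NarrowBytes bytes of Wide equal Narrow, so Narrow
// could be read through a sub-register of the register holding Wide.
bool lowPartMatches(uint64_t Wide, uint64_t Narrow, unsigned NarrowBytes) {
  uint64_t Mask = NarrowBytes >= 8 ? ~uint64_t(0)
                                   : (uint64_t(1) << (8 * NarrowBytes)) - 1;
  return (Wide & Mask) == (Narrow & Mask);
}

// Linear search for a register whose low part provides the wanted constant.
// Returns the index in Pool, or -1 if no entry can be reused.
int findReusable(const std::vector<MaterializedConst> &Pool, uint64_t Value,
                 unsigned Bytes) {
  for (std::size_t I = 0; I != Pool.size(); ++I)
    if (Pool[I].Bytes >= Bytes && lowPartMatches(Pool[I].Value, Value, Bytes))
      return static_cast<int>(I);
  return -1;
}
```

With the example from the text, a 16-bit register holding 0x0fff satisfies a request for the 8-bit constant 0xff.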

What do you think?

BTW, you could also have some function-level scope if you really are interested in saving as much size as possible.

Cheers,
-Quentin

lib/Target/X86/X86MaterializeImmediates.cpp
10	s/resister/register
32	I believe this should come after the includes. In other words, usually right after using namespace llvm.
50	I believe most users won’t mess with the related option and I rather have the number I asked on the command line than a hidden maximal value. The bottom line is, please do not use a hard-coded maximal value here.
54	Ditto.
58	Ditto.
62	Ditto.
66	Ditto.
72	Init this with the maximal value.
79	Ditto.
88	Ditto.
93	Ditto.
98	Ditto.
112	I would explicitly set the value, to be sure anyone updating this code matches the intent.
130	Period at the end of the comments.
134	IsLoadImm maybe?
135	For this patch I think it is fine to have all the information written here, but in the future it would be nice if it could be part of tablegen. I do not like having yet another place to update when I add instructions.
241	Could you add a comment on that function? I do not understand its purpose with just its name.
243	What do these comments mean?
253	Shouldn’t we assert Entire is bigger than Part?
269	Ditto.
795	nullptr instead of 0.
947	expences -> expenses.

Hi Quentin,

Thank you very much for your detailed review.

2014-10-04 4:34 GMT+07:00 Quentin Colombet <qcolombet@apple.com>:

Hi Serge,

A couple of general comments:

None of the comments uses the doxygen style: /// instead of //. Please
fix it.
Comments start with a capital letter and end with a period (or any
appropriate punctuation :)).
As I said in one of my emails, we should do that just for Os and Oz
functions (i.e., look for the function attributes OptimizeForSize and MinSize).

OK, will fix these.

The overall approach seems more complicated than I would expect for such a
pass. In particular, I think that all the InReg, Scheduled, and Waiting
states could be avoided, as well as all the history thing with the
expiration date.

History was introduced to track only "fresh" immediate values. A value that
is not used for a long time would occupy memory and increase search time;
with long basic blocks these costs might be substantial. On the other hand,
only values used in the close vicinity of the current instruction should be
taken into account. History solves the problem of maintaining a compact and
up-to-date pool of values. It does not require memory allocation and has
predictable search time. This pass was designed as a kind of digital filter
that processes the instruction flow; the terms 'history' and 'time' were
borrowed from there. Using value history is attractive because it guarantees
bounded memory use and execution time.
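
The idea of a bounded history can be illustrated with the sketch below. It is not the data structure from the patch: ValueHistory and its members are hypothetical, the capacity of 4 merely mirrors MaxHistoryWidth, and only the "drop the oldest value when full" behaviour is modelled.

```cpp
#include <array>
#include <cstdint>

// Fixed-capacity pool of recently seen immediates. No heap allocation is
// performed, and every lookup touches at most four slots.
class ValueHistory {
  struct Slot {
    int64_t Value;
    unsigned LastUse; // Serial number of the latest using instruction.
    unsigned NumUses;
    bool Valid;
  };
  std::array<Slot, 4> Slots{}; // Value-initialized: all slots start free.

public:
  // Records a use at "time" SN and returns the number of uses tracked for
  // Value. When the pool is full, the slot with the oldest use is replaced.
  unsigned record(int64_t Value, unsigned SN) {
    Slot *Victim = &Slots[0];
    for (Slot &S : Slots) {
      if (S.Valid && S.Value == Value) {
        S.LastUse = SN;
        return ++S.NumUses;
      }
      // Prefer a free slot, otherwise the least recently used one.
      if (Victim->Valid && (!S.Valid || S.LastUse < Victim->LastUse))
        Victim = &S;
    }
    *Victim = Slot{Value, SN, 1, true};
    return 1;
  }
};
```

A value that falls out of the pool and is seen again later starts counting from one use, which is the price paid for the fixed memory footprint.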

As for the InReg, Scheduled and Waiting states, they exist to postpone the
decision about which register class is to be used to store the immediate. If
we see that some 32-bit value is profitable to materialize but then see a use
of a 64-bit value whose lower half is that 32-bit value, we could use part of
the 64-bit value, provided we inserted the appropriate load instruction. It
looks like this code may indeed be simplified.

My understanding is that you try to limit the impact of this
transformation on the register pressure. That seems fragile to me (there
are a lot of magic numbers involved here for the different limits) and
harder than it should be to understand.

What I would recommend is having more faith in the register allocator. The
live ranges you are creating/extending are rematerializable. Therefore, if
the register pressure becomes too high, the register allocator can
theoretically rematerialize the values instead of spilling.
The main problem, then, is that you may end up with code like this:
reg = mov imm
reg2 = op reg3, reg
instead of
reg2 = op reg3, imm

That said, we could probably teach the register allocator to “fold” the
operand as we do for memory load, so I wouldn’t worry too much about that.

Yes, it is a good idea to teach the register allocator about spilling
registers loaded from immediates. If the register allocator could handle
such registers flexibly enough, it would minimize the impact of this pass on
register pressure, and some magic constants in this pass could be avoided.

The bottom line is that I would expect the pass to be a relatively simple
scan that creates appropriate constants when the whole basic block has been
traversed.
I.e., something like:
ConstantsUsage // Sort of dictionary. Maps a constant to its uses.
for each instr in bb
  if instr can use register variant
    record instr, instr.imm in ConstantsUsage // This could be the tricky part. See below for some thoughts.

for each constant in ConstantsUsage
  if constant is profitable
    materialize constant // This creates one constant and updates all its uses using the appropriate subreg, removing the redundant load imm, etc.

In this approach we have full information about immediate uses at the end of
the basic block, which simplifies the materialization decision. The only
drawback is higher resource consumption, both memory (all values in the BB
must be kept) and execution time (two passes instead of one). Maybe
enhancements in the register allocator can alleviate this problem. For
instance, all values that could be profitable are loaded into virtual
registers and the register allocator decides which values should be
converted back into immediates; it uses heuristics to assign physical
registers anyway.

Regarding ‘record instr, imm, ConstantsUsage’.

The hard part, per se, is to enable reuse of a wide constant, e.g., 0x0fff
is used for 0xff.

It is expected in memset expansion. In other cases this is probably rare.

I believe that you wouldn’t have that many constants to look through per
basic block and a linear search may be sufficient.
Now, if we want something faster, we could do some trick like having a
mapping per size, sorted by profitability, etc.

What do you think?

For me, using history is attractive because of its predictable resource
consumption, but a map is probably enough in most cases; this needs
investigation.

BTW, you could also have some function-level scope if you really are
interested in saving as much size as possible.

Looks like a job for RegisterCoalescer?

Cheers,
-Quentin

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:10
@@ +9,3 @@
+//
+// This file defines the pass which will move immediate operand into a resister
+// if it is used in several adjacent instructions. This transformation tries to

s/resister/register

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:32
@@ +31,3 @@
+
+#define DEBUG_TYPE "x86-imm2reg"

+#include "X86.h"

I believe this should come after the includes. In other words, usually
right after using namespace llvm.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:50
@@ +49,3 @@
+// may be set smaller by supplying proper command line option.
+const unsigned MaxHistoryDepth = 8;

+

I believe most users won’t mess with the related option and I rather have
the number I asked on the command line than a hidden maximal value.
The bottom line is, please do not use a hard-coded maximal value here.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:54
@@ +53,3 @@
+// may be set smaller by supplying proper command line option.
+const unsigned MaxHistoryWidth = 4;

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:58
@@ +57,3 @@
+// immediate value before it is dropped from history.
+const unsigned DefMaxSeparatingInstrs = MaxHistoryWidth;

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:62
@@ +61,3 @@
+// values.
+const unsigned DefMaxRegisters = 2;

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:66
@@ +65,3 @@
+// register.
+const unsigned DefMinProfit = 1;

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:72
@@ +71,3 @@
+ cl::desc("Max number of immediate values tracked in history."),
+ cl::init(MaxHistoryWidth), cl::Hidden);

+

Init this with the maximal value.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:79
@@ +78,3 @@
+ cl::desc("Max number of uses stored for a tracked immediate value."),
+ cl::init(MaxHistoryDepth), cl::Hidden);

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:88
@@ +87,3 @@
+ cl::desc("Max number of instructions separating two immediate uses."),
+ cl::init(DefMaxSeparatingInstrs), cl::Hidden);

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:93
@@ +92,3 @@
+ cl::desc("Max number of registers used for materialization."),
+ cl::init(DefMaxRegisters), cl::Hidden);

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:98
@@ +97,3 @@
+ cl::desc("Minimal number of bytes that materialization must save."),
+ cl::init(DefMinProfit), cl::Hidden);

+

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:112
@@ +111,3 @@
+ // Must be 1 << DATA_N == size in bytes
+ DATA_8,

+ DATA_16,

I would explicitly set the value, to be sure anyone updating this code
matches the intent.
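
One way the suggestion could look is sketched below. The enumerator and function names follow the patch; the explicit values, the constexpr qualifier and the static_assert are additions made for this illustration only.

```cpp
// Each enumerator is written out so that the dependency of SizeInBytes on
// the numeric values is visible to anyone editing the enum.
enum ImmediateSize {
  DATA_8 = 0,  // 1 << 0 == 1 byte
  DATA_16 = 1, // 1 << 1 == 2 bytes
  DATA_32 = 2, // 1 << 2 == 4 bytes
  DATA_64 = 3, // 1 << 3 == 8 bytes
  DATA_Total
};

constexpr unsigned SizeInBytes(ImmediateSize Sz) { return 1U << Sz; }

// Fails to compile if someone reorders or renumbers the enumerators.
static_assert(SizeInBytes(DATA_8) == 1 && SizeInBytes(DATA_64) == 8,
              "enumerator values must equal log2 of the size in bytes");
```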

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:130
@@ +129,3 @@
+ int Opcode;
+ int NewOpCode; // Opcode if imm is replaced by reg

+ ImmediateSize Size; // of immediate data

Period at the end of the comments.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:134
@@ +133,3 @@
+ int Profit; // Gain in bytes if imm is changed to reg
+ bool IsLoad; // is this an instruction like 'mov imm, reg'?

+};

IsLoadImm maybe?

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:135
@@ +134,3 @@
+ bool IsLoad; // is this an instruction like 'mov imm, reg'?
+};

+

For this patch I think it is fine to have all the information written
here, but in the future it would be nice if it could be part of tablegen.
I do not like having yet another place to update when I add instructions.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:241
@@ +240,3 @@
+
+static int GetSizeOfMovImmToRegInstr(int OpCode) {

+ if (OpCode == X86::MOV16ri)

Could you add a comment on that function?
I do not understand its purpose with just its name.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:243
@@ +242,3 @@
+ if (OpCode == X86::MOV16ri)
+ return 4; // 66 B8 iw

+ if (OpCode == X86::MOV32ri)

What do these comments mean?
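
The thread does not record an answer. The comments appear to follow the opcode notation of the Intel instruction-set reference: 66 is the operand-size prefix, B8 and C7 are opcode bytes, /0 denotes a ModRM byte, and iw, id, io are 2-, 4- and 8-byte immediates. RX is presumably the author's shorthand for a REX prefix; that reading is an assumption. Under it, the instruction lengths can be reproduced as in the following sketch, where the Encoding struct and the length function are hypothetical.

```cpp
// Byte counts of the pieces named in the encoding comments.
struct Encoding {
  unsigned Prefix; // 66 operand-size prefix or REX prefix, 0 if absent.
  unsigned Opcode; // B8 or C7.
  unsigned ModRM;  // 1 when the form is written with /0, else 0.
  unsigned Imm;    // iw = 2, id = 4, io = 8.
};

constexpr unsigned length(Encoding E) {
  return E.Prefix + E.Opcode + E.ModRM + E.Imm;
}

constexpr Encoding Mov16ri = {1, 1, 0, 2};   // 66 B8 iw
constexpr Encoding Mov32ri = {0, 1, 0, 4};   // B8 id
constexpr Encoding Mov64ri = {1, 1, 0, 8};   // RX B8 io
constexpr Encoding Mov64ri32 = {1, 1, 1, 4}; // RX C7 /0 id
```

These sums match the values returned by GetSizeOfMovImmToRegInstr in the patch: 4, 5, 10 and 7 bytes.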

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:253
@@ +252,3 @@
+
+static unsigned GetSubreg(ImmediateSize Entire, ImmediateSize Part) {

+ if (Entire == Part)

Shouldn’t we assert Entire is bigger than Part?
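
A version with the requested assertion could look like the sketch below. To keep it self-contained, the two enums are stand-ins for the patch's ImmediateSize and for the X86 sub-register indices; their numeric values are placeholders, not LLVM's. The check uses "not wider than" rather than "bigger than" because the function already handles the case of equal sizes.

```cpp
#include <cassert>

// Stand-ins for the patch's ImmediateSize and the X86 sub-register indices.
enum ImmediateSize { DATA_8, DATA_16, DATA_32, DATA_64, DATA_Total };
enum SubRegIndex { NoSubRegister, sub_8bit, sub_16bit, sub_32bit };

// Returns the sub-register index that selects the low Part of a register
// holding a value of size Entire.
SubRegIndex GetSubreg(ImmediateSize Entire, ImmediateSize Part) {
  assert(Entire < DATA_Total && Part <= Entire &&
         "Part must not be wider than Entire");
  if (Entire == Part)
    return NoSubRegister;
  switch (Part) {
  case DATA_8:
    return sub_8bit;
  case DATA_16:
    return sub_16bit;
  case DATA_32:
    return sub_32bit;
  default:
    break;
  }
  assert(false && "Invalid size");
  return NoSubRegister;
}
```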

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:269
@@ +268,3 @@
+static unsigned GetSubregByBytes(unsigned Entire, unsigned Part) {
+ if (Entire == Part)

+ return X86::NoSubRegister;

Ditto.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:795
@@ +794,3 @@
+ InstrInfo *Rec = OpCodeTable::Find(Instr.getOpcode());
+ if (Rec == 0)

+ return false;

nullptr instead of 0.

Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:947
@@ +946,3 @@
+
+ // If use set of the value contains load instruction, expences for load

+ // instruction can be decreased.

expences -> expenses.

http://reviews.llvm.org/D5544

spatel mentioned this in D11363: Allow merging of immediates within a basic block for code size savings and reduced footprint. Jul 31 2015, 10:59 AM

jevinskie added a subscriber: jevinskie. Aug 11 2015, 2:39 PM

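Revision Contents

Path	Size
lib/Target/X86/
  CMakeLists.txt	1 line
  X86.h	4 lines
  X86MaterializeImmediates.cpp	1052 lines
  X86TargetMachine.cpp	4 lines
test/CodeGen/X86/
  coalescer-commute3.ll	2 lines
  (file name missing)	4 lines
  (file name missing)	444 lines
  (file name missing)	77 lines
  (file name missing)	29 lines
  memset64-on-x86-32.ll	4 lines
  (file name missing)	8 lines
  (file name missing)	5 lines
  (file name missing)	5 lines
  (file name missing)	5 lines
  (file name missing)	12 lines
test/DebugInfo/X86/
  debug-loc-offset.ll	2 lines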

Diff 14236

lib/Target/X86/CMakeLists.txt

[16 lines not shown] set(sources
  X86FastISel.cpp
  X86FloatingPoint.cpp
  X86FrameLowering.cpp
  X86ISelDAGToDAG.cpp
  X86ISelLowering.cpp
  X86InstrInfo.cpp
  X86MCInstLower.cpp
  X86MachineFunctionInfo.cpp
+ X86MaterializeImmediates.cpp
  X86PadShortFunction.cpp
  X86RegisterInfo.cpp
  X86SelectionDAGInfo.cpp
  X86Subtarget.cpp
  X86TargetMachine.cpp
  X86TargetObjectFile.cpp
  X86TargetTransformInfo.cpp
  X86VZeroUpper.cpp
[21 lines not shown]

lib/Target/X86/X86.h

[61 lines not shown]
  /// with NOOPs. This will prevent a stall when returning on the Atom.
  FunctionPass *createX86PadShortFunctions();
  /// createX86FixupLEAs - Return a a pass that selectively replaces
  /// certain instructions (like add, sub, inc, dec, some shifts,
  /// and some multiplies) by equivalent LEA instructions, in order
  /// to eliminate execution delays in some Atom processors.
  FunctionPass *createX86FixupLEAs();

+ /// \brief Creates a pass that moves constants used as immediates into
+ /// registers, if this reduces code size.
+ FunctionPass *createX86MaterializeImmediates();

  } // End llvm namespace

  #endif

lib/Target/X86/X86MaterializeImmediates.cpp

This file was added.

				//===-------- X86MaterializeImmediate.cpp - move immediate to register ----===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the pass which will move immediate operand into a resister
				qcolombet: s/resister/register
				// if it is used in several adjacent instructions. This transformation tries to
				// reduce code size by using shorter instructions.
				//
				//
				// Decision, whether the instruction with immediate value should be replaced by
				// its register form, is made by analysis of the immediate usage history. The
				// pass scans instructions in basic blocks, when it find an instruction that
				// uses immediate value and has equivalent form with register, it records info
				// about it into the history. Sequential number of the instruction in basic
				// block plays a role of time. The history is organized as a matrix, one
				// dimension is "time", the other corresponds to the value of immediate
				// operand. The history is limited in both dimensions, if there is no room for
				// new immediate use, the oldest one is dropped.
				//
				// Each immediate value in the history is thus represented by series of its
				// uses. Based on this set the profit of moving the value to register is
				// calculated. If the size gain is enough, the series of the instructions may be
				// replaced by load instruction and series of instructions using the register.
				//
				//===----------------------------------------------------------------------===//

				#define DEBUG_TYPE "x86-imm2reg"
				qcolombet: I believe this should come after the includes. In other words, usually right after using namespace llvm.
				#include "X86.h"
				#include "X86InstrInfo.h"
				#include "X86Subtarget.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/CodeGen/MachineInstrBuilder.h"
				#include "llvm/CodeGen/MachineRegisterInfo.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/IR/Function.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/raw_ostream.h"
				#include "llvm/Target/TargetInstrInfo.h"
				using namespace llvm;

				// Maximal number of uses kept for an immediate value in history. Actual number
				// may be set smaller by supplying proper command line option.
				const unsigned MaxHistoryDepth = 8;
				qcolombet: I believe most users won’t mess with the related option and I rather have the number I asked on the command line than a hidden maximal value. The bottom line is, please do not use a hard-coded maximal value here.

				// Maximal number of immediate values that history can track. Actual number
				// may be set smaller by supplying proper command line option.
				const unsigned MaxHistoryWidth = 4;
				qcolombet: Ditto.

				// Default maximal number of instruction allowed after the last use of an
				// immediate value before it is dropped from history.
				const unsigned DefMaxSeparatingInstrs = MaxHistoryWidth;
				qcolombet: Ditto.

				// Default maximal number of registers that may be used for caching immediate
				// values.
				const unsigned DefMaxRegisters = 2;
				qcolombet: Ditto.

				// Default minimal gain in bytes, necessary for moving immediate value to a
				// register.
				const unsigned DefMinProfit = 1;
				qcolombet: Ditto.

				// Max number of immediate values tracked simultaneously.
				static cl::opt<unsigned> MaxImmediates(
				"x86-i2r-histwidth",
				cl::desc("Max number of immediate values tracked in history."),
				cl::init(MaxHistoryWidth), cl::Hidden);
				qcolombet: Init this with the maximal value.

				// Max number of usages tracked for particular immediate value which is not
				// moved it into a register yet.
				static cl::opt<unsigned> HistoryDepth(
				"x86-i2r-histdepth",
				cl::desc("Max number of uses stored for a tracked immediate value."),
				cl::init(MaxHistoryDepth), cl::Hidden);
				qcolombet: Ditto.

				// Maximal number of instructions that may separate consecutive uses of the
				// same immediate value. If last instructions that uses particular immediate
				// value is followed by more instructions that do not, this value is removed
				// from tracking, register is freed.
				static cl::opt<unsigned> MaxSeparatingInstrs(
				"x86-i2r-maxsep",
				cl::desc("Max number of instructions separating two immediate uses."),
				cl::init(DefMaxSeparatingInstrs), cl::Hidden);
				qcolombet: Ditto.

				static cl::opt<unsigned>
				MaxRegisters("x86-i2r-regs",
				cl::desc("Max number of registers used for materialization."),
				cl::init(DefMaxRegisters), cl::Hidden);
				qcolombet: Ditto.

				static cl::opt<int> MinProfit(
				"x86-i2r-profit",
				cl::desc("Minimal number of bytes that materialization must save."),
				cl::init(DefMinProfit), cl::Hidden);
				qcolombet: Ditto.

				STATISTIC(NumMatImmediates, "Number of materialized constants");
				STATISTIC(NumMatInstructions, "Number of instructions with materialized imms");
				STATISTIC(NumMatBytes, "Number of bytes saved by materialization");
				STATISTIC(NumMatDropped, "Number of value uses dropped from history");

				namespace {

				typedef MachineBasicBlock::iterator InstrIterator;

				// Enumerates possible size of immediate operand.
				enum ImmediateSize {
				// Must be 1 << DATA_N == size in bytes
				DATA_8,
				qcolombet: I would explicitly set the value, to be sure anyone updating this code matches the intent.
				DATA_16,
				DATA_32,
				DATA_64,
				DATA_Total
				};

				static unsigned SizeInBytes(ImmediateSize Sz) {
				assert(Sz < DATA_Total);
				return 1U << Sz;
				}

				// Information about instruction that uses immediate value.
				//
				// This information depends on instruction opcode only, this type is used for
				// table elements that reside in read-only memory.
				struct InstrInfo {
				int Opcode;
				int NewOpCode; // Opcode if imm is replaced by reg
				qcolombet: Period at the end of the comments.
				ImmediateSize Size; // of immediate data
				int OperandNo; // Argument number of immediate operand
				int Profit; // Gain in bytes if imm is changed to reg
				bool IsLoad; // is this an instruction like 'mov imm, reg'?
				qcolombet: IsLoadImm maybe?
				};
				qcolombet: For this patch I think it is fine to have all the information written here, but in the future it would be nice if it could be part of tablegen. I do not like having yet another place to update when I add instructions.

				// Interface to database of instructions that have immediate operand and can
				// be transformed to register form.
				class OpCodeTable {
				static InstrInfo Table[];
				static unsigned TableSize;
				static bool Sorted;

				static bool OpCodeCmp(const InstrInfo &A1, const InstrInfo &A2) {
				return A1.Opcode < A2.Opcode;
				}
				static bool OpCodeFind(const InstrInfo &A1, int OpCode) {
				return A1.Opcode < OpCode;
				}

				public:
				static InstrInfo *begin() { return Table; }
				static InstrInfo *end() { return Table + TableSize; }

				static void Sort() {
				if (!Sorted) {
				std::sort(begin(), end(), OpCodeCmp);
				Sorted = true;
				}
				}

				static InstrInfo *Find(int OpCode) {
				InstrInfo *P = std::lower_bound(begin(), end(), OpCode, OpCodeFind);
				if (P == end())
				return nullptr;
				if (P->Opcode != OpCode)
				return nullptr;
				return P;
				}
				};

				// Database of instructions with immediate operand.
				InstrInfo OpCodeTable::Table[] = {
				{X86::MOV16ri, X86::MOV16rr, DATA_16, 1, 1, 1}, // 66 B8 iw -> 66 89 /r
				{X86::MOV32ri, X86::MOV32rr, DATA_32, 1, 3, 1}, // B8 id -> 89 /r
				{X86::MOV64ri, X86::MOV64rr, DATA_64, 1, 7, 1}, // RX B8 io -> RX 89 /r
				{X86::MOV64ri32, X86::MOV64rr, DATA_64, 1, 4, 1}, // RX C7 /0 id -> RX 89 /r
				{X86::MOV8mi, X86::MOV8mr, DATA_8, 5, 1}, // C6 /0 ib -> 88 /r
				{X86::MOV16mi, X86::MOV16mr, DATA_16, 5, 2}, // 66 C7 /0 iw -> 66 89 /r
				{X86::MOV32mi, X86::MOV32mr, DATA_32, 5, 4}, // C7 /0 id -> 89 /r
				{X86::MOV64mi32, X86::MOV64mr, DATA_64, 5, 4}, // RX C7 /0 id -> RX 89 /r
				{X86::ADD8mi, X86::ADD8mr, DATA_8, 5, 1}, // 80 /0 ib -> 00 /r
				{X86::ADD16mi, X86::ADD16mr, DATA_16, 5, 2}, // 66 81 /0 iw -> 66 01 /r
				{X86::ADD32mi, X86::ADD32mr, DATA_32, 5, 4}, // 81 /0 id -> 01 /r
				{X86::ADD64mi32, X86::ADD64mr, DATA_64, 5, 4}, // RX 81 /0 id -> RX 01 /r
				{X86::ADD16mi8, X86::ADD16mr, DATA_16, 5, 1}, // 66 83 /0 ib -> 66 01 /r
				{X86::ADD32mi8, X86::ADD32mr, DATA_32, 5, 1}, // 83 /0 ib -> 01 /r
				{X86::ADD64mi8, X86::ADD64mr, DATA_64, 5, 1}, // RX 83 /0 ib -> RX 01 /r
				{X86::ADC8mi, X86::ADC8mr, DATA_8, 5, 1},
				{X86::ADC16mi, X86::ADC16mr, DATA_16, 5, 2},
				{X86::ADC32mi, X86::ADC32mr, DATA_32, 5, 4},
				{X86::ADC64mi32, X86::ADC64mr, DATA_64, 5, 4},
				{X86::ADC16mi8, X86::ADC16mr, DATA_16, 5, 1},
				{X86::ADC32mi8, X86::ADC32mr, DATA_32, 5, 1},
				{X86::ADC64mi8, X86::ADC64mr, DATA_64, 5, 1},
				{X86::SUB8mi, X86::SUB8mr, DATA_8, 5, 1},
				{X86::SUB16mi, X86::SUB16mr, DATA_16, 5, 2},
				{X86::SUB32mi, X86::SUB32mr, DATA_32, 5, 4},
				{X86::SUB64mi32, X86::SUB64mr, DATA_64, 5, 4},
				{X86::SUB16mi8, X86::SUB16mr, DATA_16, 5, 1},
				{X86::SUB32mi8, X86::SUB32mr, DATA_32, 5, 1},
				{X86::SUB64mi8, X86::SUB64mr, DATA_64, 5, 1},
				{X86::SBB8mi, X86::SBB8mr, DATA_8, 5, 1},
				{X86::SBB16mi, X86::SBB16mr, DATA_16, 5, 2},
				{X86::SBB32mi, X86::SBB32mr, DATA_32, 5, 4},
				{X86::SBB64mi32, X86::SBB64mr, DATA_64, 5, 4},
				{X86::SBB16mi8, X86::SBB16mr, DATA_16, 5, 1},
				{X86::SBB32mi8, X86::SBB32mr, DATA_32, 5, 1},
				{X86::SBB64mi8, X86::SBB64mr, DATA_64, 5, 1},
				{X86::AND8mi, X86::AND8mr, DATA_8, 5, 1},
				{X86::AND16mi, X86::AND16mr, DATA_16, 5, 2},
				{X86::AND32mi, X86::AND32mr, DATA_32, 5, 4},
				{X86::AND64mi32, X86::AND64mr, DATA_64, 5, 4},
				{X86::AND16mi8, X86::AND16mr, DATA_16, 5, 1},
				{X86::AND32mi8, X86::AND32mr, DATA_32, 5, 1},
				{X86::AND64mi8, X86::AND64mr, DATA_64, 5, 1},
				{X86::OR8mi, X86::OR8mr, DATA_8, 5, 1},
				{X86::OR16mi, X86::OR16mr, DATA_16, 5, 2},
				{X86::OR32mi, X86::OR32mr, DATA_32, 5, 4},
				{X86::OR64mi32, X86::OR64mr, DATA_64, 5, 4},
				{X86::OR16mi8, X86::OR16mr, DATA_16, 5, 1},
				{X86::OR32mi8, X86::OR32mr, DATA_32, 5, 1},
				{X86::OR64mi8, X86::OR64mr, DATA_64, 5, 1},
				{X86::XOR8mi, X86::XOR8mr, DATA_8, 5, 1},
				{X86::XOR16mi, X86::XOR16mr, DATA_16, 5, 2},
				{X86::XOR32mi, X86::XOR32mr, DATA_32, 5, 4},
				{X86::XOR64mi32, X86::XOR64mr, DATA_64, 5, 4},
				{X86::XOR16mi8, X86::XOR16mr, DATA_16, 5, 1},
				{X86::XOR32mi8, X86::XOR32mr, DATA_32, 5, 1},
				{X86::XOR64mi8, X86::XOR64mr, DATA_64, 5, 1},
				{X86::CMP8mi, X86::CMP8mr, DATA_8, 5, 1},
				{X86::CMP16mi, X86::CMP16mr, DATA_16, 5, 2},
				{X86::CMP32mi, X86::CMP32mr, DATA_32, 5, 4},
				{X86::CMP64mi32, X86::CMP64mr, DATA_64, 5, 4},
				{X86::CMP16mi8, X86::CMP16mr, DATA_16, 5, 1},
				{X86::CMP32mi8, X86::CMP32mr, DATA_32, 5, 1},
				{X86::CMP64mi8, X86::CMP64mr, DATA_64, 5, 1}};
				unsigned OpCodeTable::TableSize = array_lengthof(OpCodeTable::Table);
				bool OpCodeTable::Sorted = false;

				static int GetSizeOfMovImmToRegInstr(int OpCode) {
				qcolombet: Could you add a comment on that function? I do not understand its purpose with just its name.
				if (OpCode == X86::MOV16ri)
				return 4; // 66 B8 iw
				qcolombet: What do these comments mean?
				if (OpCode == X86::MOV32ri)
				return 5; // B8 id
				if (OpCode == X86::MOV64ri)
				return 10; // RX B8 io
				if (OpCode == X86::MOV64ri32)
				return 7; // RX C7 /0 id
				llvm_unreachable("Not mov imm->reg instruction");
				}

				static unsigned GetSubreg(ImmediateSize Entire, ImmediateSize Part) {
				qcolombet: Shouldn’t we assert Entire is bigger than Part?
				if (Entire == Part)
				return X86::NoSubRegister;
				switch (Part) {
				case DATA_8:
				return X86::sub_8bit;
				case DATA_16:
				return X86::sub_16bit;
				case DATA_32:
				return X86::sub_32bit;
				default:
				llvm_unreachable("Invalid size");
				}
				}

				static unsigned GetSubregByBytes(unsigned Entire, unsigned Part) {
				if (Entire == Part)
				qcolombet: Ditto.
				return X86::NoSubRegister;
				switch (Part) {
				case 1:
				return X86::sub_8bit;
				case 2:
				return X86::sub_16bit;
				case 4:
				return X86::sub_32bit;
				default:
				llvm_unreachable("Invalid size");
				}
				}
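
One possible way to address the two review comments above is to assert the size relation before selecting a sub-register index. The following is a minimal standalone sketch, not the patch's code: `SubRegIdx` and `subregByBytes` are hypothetical stand-ins for the `X86::sub_*` indices and `GetSubregByBytes`, and a plain `assert` takes the place of `llvm_unreachable`.

```cpp
#include <cassert>

// Hypothetical stand-ins for X86::NoSubRegister, X86::sub_8bit, etc.
enum SubRegIdx { NoSubRegister, Sub8Bit, Sub16Bit, Sub32Bit };

// Maps a (whole register size, part size) pair in bytes to a sub-register
// index, asserting first that the part fits into the whole.
inline SubRegIdx subregByBytes(unsigned Entire, unsigned Part) {
  assert(Entire >= Part && "Part must not be wider than Entire");
  if (Entire == Part)
    return NoSubRegister;
  switch (Part) {
  case 1:
    return Sub8Bit;
  case 2:
    return Sub16Bit;
  case 4:
    return Sub32Bit;
  default:
    assert(false && "Invalid size");
    return NoSubRegister;
  }
}
```

With this check a call such as `subregByBytes(2, 4)` would trip the assertion in a debug build instead of silently returning `Sub32Bit`.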

				// Possible results of the test if 64-bit immediate value can be represented
				// as extended 32-bit value.
				enum ExtensionKind {
				NoExtension, // cannot be represented as extension of 32 bit
				ZeroExtension, // 64-bit value = ZExt(32-bit value)
				SignExtension // 64-bit value = SExt(32-bit value)
				};

				// Pair of an immediate value and the instruction that uses it.
				//
				// This type is used for temporary values which are not placed into
				// history yet.
				class ImmediateValueUse {
				int64_t Value;
				const InstrInfo *Info;

				public:
				void Init(int64_t V, const InstrInfo *II) {
				assert(II);
				Value = V;
				Info = II;
				}

				const InstrInfo &GetInstrInfo() const { return *Info; }
				int64_t GetValue() const { return Value; }
				ImmediateSize GetSize() const { return Info->Size; }
				int GetOperand() const { return Info->OperandNo; }

				uint32_t Get32bits() const { return (uint32_t)Value; }
				uint16_t Get16bits() const { return (uint16_t)Value; }
				uint8_t Get8bits() const { return (uint8_t)Value; }
				};

				// Represents use of an immediate value in history.
				//
				// Immediate value in history is kept separately, so the use contains only
				// info about using instruction.
				class RecordedUse {
				InstrIterator Instruction; // that uses the value
				unsigned SerialNumber; // of the instruction
				const InstrInfo *IInfo; // Associated info about imm usage

				public:
				void Init(InstrIterator I, unsigned SN, const InstrInfo &II) {
				Instruction = I;
				SerialNumber = SN;
				IInfo = &II;
				}

				InstrIterator GetInstr() const { return Instruction; }
				unsigned GetSN() const { return SerialNumber; }
				int GetProfit() const { return IInfo->Profit; }
				ImmediateSize GetSize() const { return IInfo->Size; }
				int GetOperand() const { return IInfo->OperandNo; }
				int GetNewOpCode() const { return IInfo->NewOpCode; }
				bool IsLoad() const { return IInfo->IsLoad; }
				};

				// Represents an immediate value in history.
				struct RecordedValue {

				// Iterating through value is iterating through its uses.
				typedef RecordedUse *Iterator;
				Iterator begin() { return Uses; }
				Iterator end() { return Uses + TotalUses; }

				// Enumerates possible states of a value slot in history.
				enum ValueState {
				Free, // Does not track any immediate value
				Used, // Value is not profitable to materialize yet
				Waiting, // Is profitable, but no registers are available
				Scheduled, // Selected for materialization
				InReg, // Materialization in progress
				};

				ValueState State;
				int64_t Value;
				unsigned TotalUses; // Number of recorded uses
				RecordedUse Uses[MaxHistoryDepth]; // Recorded uses of this value
				unsigned SizeCnt[DATA_Total]; // Counts uses of different size
				InstrIterator FirstInstruction; // that uses the immediate value
				bool HasLoad;
				Iterator LoadInstruction; // Instr like MOV32ri, if exists
				int Profit; // Gain in size (bytes)
				unsigned ReplacedCnt; // Counts transformations imm->reg
				unsigned Register; // If the value is cached in reg
				ImmediateSize RegSize; // Size of allocated register
				bool UsesREX; // Register requires prefix to access

				// State query
				bool IsFree() const { return State == Free; }
				bool IsUsed() const { return State == Used; }
				bool IsWaiting() const { return State == Waiting; }
				bool IsInReg() const { return State == InReg; }
				bool IsScheduled() const { return State == Scheduled; }
				bool IsFull() const { return TotalUses == HistoryDepth; }

				// State change
				void MarkWaiting() {
				assert(State == Used || State == Waiting);
				State = Waiting;
				}
				void MarkSchedule() {
				assert(State == Used || State == Waiting);
				State = Scheduled;
				Register = 0;
				}
				void MarkInReg(int Reg, bool UseREX) {
				assert(State == Scheduled);
				assert(Reg != 0);
				State = InReg;
				Register = Reg;
				UsesREX = UseREX;
				}
				void MarkFree() { State = Free; }

				int64_t GetValue() const { return Value; }
				ImmediateSize GetSize() const { return RegSize; }

				uint32_t Get32bits() const {
				assert(State != Free);
				return (uint32_t)Value;
				}
				uint16_t Get16bits() const {
				assert(State != Free);
				return (uint16_t)Value;
				}
				uint8_t Get8bits() const {
				assert(State != Free);
				return (uint8_t)Value;
				}

				ExtensionKind ExtensionOf32() const {
				if (SizeCnt[DATA_64] == 0)
				return NoExtension;
				if ((int64_t)(Get32bits()) == Value)
				return ZeroExtension;
				if ((int64_t)(int32_t)Get32bits() == Value)
				return SignExtension;
				return NoExtension;
				}

				// Returns size in bytes of the instruction that loads the immediate value
				// into a register. On a 64-bit target the size is calculated for loading into
				// a register that does not require a REX prefix. If the instruction requires
				// a REX prefix anyway, the argument WithREX is set to 'true'.
				int GetLoadInstructionSize(bool &WithREX) const {
				WithREX = false;
				if (SizeCnt[DATA_64]) {
				switch (ExtensionOf32()) {
				case ZeroExtension:
				return 5; // B8 id
				case SignExtension:
				WithREX = true;
				return 7; // REX C7 /0 id
				default:
				WithREX = true;
				return 10; // REX B8 io
				}
				}
				if (SizeCnt[DATA_32])
				return 5; // B8 id
				if (SizeCnt[DATA_16])
				return 4; // 66 B8 iw
				return 2; // B0 ib
				}
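
The 64-bit branch of `ExtensionOf32` and `GetLoadInstructionSize` can be tried out in isolation. The sketch below is an illustration only: `classify64` and `loadSize64` are hypothetical names, the byte counts are copied from the comments in the patch, and the `SizeCnt[DATA_64] == 0` early exit is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the patch's ExtensionKind enum.
enum ExtensionKind { NoExtension, ZeroExtension, SignExtension };

// Classifies a 64-bit immediate by how its low 32 bits extend to the full
// value. The uint32_t -> int32_t cast relies on two's complement wrap-around,
// as the patch does.
inline ExtensionKind classify64(int64_t Value) {
  uint32_t Low = static_cast<uint32_t>(Value);
  if (static_cast<int64_t>(Low) == Value)
    return ZeroExtension; // upper 32 bits are all zero
  if (static_cast<int64_t>(static_cast<int32_t>(Low)) == Value)
    return SignExtension; // upper 32 bits replicate bit 31
  return NoExtension;
}

// Size in bytes of the shortest mov imm->reg encoding for the value,
// following the encodings quoted in the patch comments.
inline int loadSize64(int64_t Value) {
  switch (classify64(Value)) {
  case ZeroExtension:
    return 5; // B8 id
  case SignExtension:
    return 7; // REX C7 /0 id
  default:
    return 10; // REX B8 io
  }
}
```

For example, `0x55555555` classifies as a zero extension (5 bytes), `-1` as a sign extension (7 bytes), and `0x100000000` needs the full 10-byte form.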

				// Returns opcode of instruction that loads immediate value into register.
				int GetLoadInstructionOpcode() const {
				if (SizeCnt[DATA_64]) {
				switch (ExtensionOf32()) {
				case ZeroExtension:
				return X86::MOV32ri64;
				case SignExtension:
				return X86::MOV64ri32;
				default:
				return X86::MOV64ri;
				}
				}
				if (SizeCnt[DATA_32])
				return X86::MOV32ri;
				if (SizeCnt[DATA_16])
				return X86::MOV16ri;
				return X86::MOV8ri;
				}

				// Returns gain in size if this value is moved to a register.
				// If materialization requires a 32-bit or shorter register, the profit is
				// calculated for registers that don't require a REX prefix to access.
				int GetProfit() {
				bool WithREX;
				return Profit - GetLoadInstructionSize(WithREX);
				}

				// Returns number of instructions the value history spans over.
				unsigned Length() {
				assert(State != Free);
				return LastUse().GetSN() - FirstUse().GetSN();
				}

				RecordedUse &LastUse() {
				assert(State != Free);
				return Uses[TotalUses - 1];
				}

				RecordedUse &FirstUse() {
				assert(State != Free);
				return Uses[0];
				}

				void Init(int64_t V) {
				assert(State == Free);
				State = Used;
				TotalUses = 0;
				for (unsigned I = 0; I < DATA_Total; ++I)
				SizeCnt[I] = 0;
				Value = V;
				RegSize = DATA_8;
				Register = 0;
				Profit = 0;
				ReplacedCnt = 0;
				HasLoad = false;
				}

				// Creates new use of the value in history.
				void NewUse(InstrIterator I, unsigned SN, const InstrInfo &IInfo) {
				assert(!IsFree());
				assert(TotalUses < HistoryDepth);
				Iterator NewUseItem = end();
				++TotalUses;
				NewUseItem->Init(I, SN, IInfo);
				Profit += IInfo.Profit;
				++SizeCnt[IInfo.Size];
				if (IInfo.Size > RegSize)
				RegSize = IInfo.Size;

				if (IInfo.IsLoad) {
				// If the instruction loads immediate into register, it may be moved
				// ahead in basic block. No additional load instruction is needed in
				// this case.
				if (!HasLoad || LoadInstruction->GetSize() <= IInfo.Size) {
				HasLoad = true;
				LoadInstruction = &LastUse();
				}
				}
				}

				// Removes oldest uses in the value history.
				void DropOldestUses(unsigned NumOfDropped) {
				assert(IsUsed() || IsWaiting());
				assert(NumOfDropped <= TotalUses);
				NumMatDropped += NumOfDropped;
				if (NumOfDropped == TotalUses) {
				State = Free;
				return;
				}
				int ProfitOfDropped = 0;
				for (unsigned I = 0; I < NumOfDropped; ++I)
				ProfitOfDropped += Uses[I].GetProfit();
				Profit -= ProfitOfDropped;
				memmove(Uses, Uses + NumOfDropped, sizeof(RecordedUse) * (TotalUses - NumOfDropped));
				TotalUses -= NumOfDropped;
				}
				};
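
The bookkeeping in `DropOldestUses` (subtract the profit of the dropped uses, then shift the survivors to the front) can be sketched on a small fixed-size buffer. This is a simplified illustration under assumptions: `Use`, `UseBuffer`, and the capacity of 8 are hypothetical, and the sketch shifts with `std::move` over exactly the surviving range instead of `memmove`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for RecordedUse.
struct Use {
  unsigned SN; // serial number of the using instruction
  int Profit;  // bytes saved if this use is rewritten
};

// Hypothetical, simplified stand-in for the use list of a RecordedValue.
struct UseBuffer {
  static constexpr std::size_t Capacity = 8;
  Use Uses[Capacity];
  std::size_t TotalUses = 0;
  int Profit = 0;

  void add(unsigned SN, int P) {
    assert(TotalUses < Capacity);
    Uses[TotalUses++] = Use{SN, P};
    Profit += P;
  }

  // Removes the N oldest uses and keeps the accumulated profit in sync.
  void dropOldest(std::size_t N) {
    assert(N <= TotalUses);
    if (N == 0)
      return;
    for (std::size_t I = 0; I < N; ++I)
      Profit -= Uses[I].Profit;
    // Shift only the TotalUses - N surviving entries to the front.
    std::move(Uses + N, Uses + TotalUses, Uses);
    TotalUses -= N;
  }
};
```

Moving exactly `TotalUses - N` elements keeps every read inside the occupied part of the array, whatever the value of `N`.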

				// Keeps track of immediate value usages.
				class ImmHistory {
				public:
				// Iterating through history is iterating through values kept in it.
				typedef RecordedValue *Iterator;
				Iterator begin() { return Immediates; }
				Iterator end() { return Immediates + MaxImmediates; }

				private:
				RecordedValue Immediates[MaxHistoryWidth];
				unsigned TotalTrackedValues;
				unsigned CurrentSN; // Counts 'time' in BB

				// Finds a slot for the new value and initializes it.
				// Returns iterator pointing to the slot.
				Iterator CreateValue(unsigned SN, ImmediateValueUse &Info) {
				assert(TotalTrackedValues <= MaxImmediates);
				Iterator Result = begin();
				for (Iterator E = end(); Result != E; ++Result) {
				if (Result->IsFree())
				break;
				if ((Result->IsUsed() || Result->IsWaiting()) && IsExpired(*Result)) {
				Drop(*Result);
				break;
				}
				}
				assert(Result != end() && "Cannot find free value slot");
				Result->Init(Info.GetValue());
				++TotalTrackedValues;
				return Result;
				}

				public:
				ImmHistory() { Clear(); }

				void Clear() {
				for (Iterator I = begin(), E = end(); I != E; ++I)
				I->MarkFree();
				TotalTrackedValues = 0;
				CurrentSN = 0;
				}

				void Advance() { ++CurrentSN; }

				// Returns 'true' if the value is expired.
				bool IsExpired(RecordedValue &Value) {
				return (CurrentSN - Value.LastUse().GetSN()) >= MaxSeparatingInstrs;
				}

				// Searches history for the specified value.
				//
				// Returns pointer to the value descriptor if the value referenced by the
				// given instruction is found in the history, otherwise returns null
				// pointer. If exact match (value and size) is not found, but history
				// contains a value that could be represented as a subreg of the value
				// sought, returns pointer to that value.
				RecordedValue *FindValue(ImmediateValueUse &Info) {
				for (RecordedValue *I = begin(); I != end(); ++I) {
				if (I->IsFree())
				continue;
				if (I->IsUsed() && IsExpired(*I)) {
				Drop(*I);
				continue;
				}

				// Look for the value that is of the same size or can be truncated to
				// the value looked for.
				if (Info.GetSize() <= I->GetSize()) {
				switch (Info.GetSize()) {
				case DATA_64:
				if (I->GetValue() == Info.GetValue())
				return I;
				break;
				case DATA_32:
				if (I->Get32bits() == Info.Get32bits())
				return I;
				break;
				case DATA_16:
				if (I->Get16bits() == Info.Get16bits())
				return I;
				break;
				case DATA_8:
				if (I->Get8bits() == Info.Get8bits())
				return I;
				break;
				default:
				llvm_unreachable("Invalid data size");
				}
				}

				// Is there a short value that can be extended to the specified?
				if (I->IsInReg()) {
				// Values that are already in a register cannot be extended.
				continue;
				} else if (Info.GetSize() > I->GetSize()) {
				Iterator Found = end();
				switch (I->GetSize()) {
				case DATA_32:
				if (I->Get32bits() == Info.Get32bits())
				Found = I;
				break;
				case DATA_16:
				if (I->Get16bits() == Info.Get16bits())
				Found = I;
				break;
				case DATA_8:
				if (I->Get8bits() == Info.Get8bits())
				Found = I;
				break;
				default:
				llvm_unreachable("Invalid data size");
				}
				if (Found != end())
				return Found;
				}
				}
				return nullptr;
				}

				// Registers the new use of the immediate value.
				RecordedValue &AddUse(RecordedValue *ValPtr, InstrIterator I,
				ImmediateValueUse &Info) {
				if (ValPtr == nullptr) {
				// New immediate value
				ValPtr = CreateValue(CurrentSN, Info);
				ValPtr->NewUse(I, CurrentSN, Info.GetInstrInfo());
				} else {
				// The new use is added to existing value
				if (ValPtr->GetSize() < Info.GetSize()) {
				// The value in history is a subreg of the added one, need to
				// widen it.
				ValPtr->Value = Info.GetValue();
				}
				if (ValPtr->IsInReg()) {
				// If the value is already in register, keep only the last use.
				assert(ValPtr->TotalUses == 1);
				ValPtr->FirstUse().Init(I, CurrentSN, Info.GetInstrInfo());
				} else {
				// Not in register yet
				assert(!ValPtr->IsScheduled() || !ValPtr->IsFull());
				if (ValPtr->IsFull())
				ValPtr->DropOldestUses(1);
				assert(!ValPtr->IsFull());
				ValPtr->NewUse(I, CurrentSN, Info.GetInstrInfo());
				}
				}
				return *ValPtr;
				}

				// Removes the specified value from history.
				void Drop(RecordedValue &ImmV) {
				assert(TotalTrackedValues > 0);
				assert(!ImmV.IsInReg());
				ImmV.MarkFree();
				--TotalTrackedValues;
				}
				};
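
The matching rule in `FindValue` amounts to comparing the two values at the narrower of their widths: a recorded value matches a new use if the low bytes agree, whichever of the two is wider. A minimal sketch of that comparison follows; `lowBytesMatch` is a hypothetical helper, sizes are given in bytes rather than as `ImmediateSize`, and the extra rule that a value already in a register is never widened is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if A and B agree in their low min(BytesA, BytesB) bytes.
// Hypothetical helper; sizes are expected to be 1, 2, 4 or 8.
inline bool lowBytesMatch(int64_t A, unsigned BytesA, int64_t B,
                          unsigned BytesB) {
  unsigned Bytes = BytesA < BytesB ? BytesA : BytesB;
  if (Bytes >= 8)
    return A == B;
  uint64_t Mask = (uint64_t(1) << (8 * Bytes)) - 1;
  return (static_cast<uint64_t>(A) & Mask) == (static_cast<uint64_t>(B) & Mask);
}
```

For instance, a recorded 16-bit `0x5555` matches an 8-bit use of `0x55` (the use becomes a sub-register access), and a recorded 8-bit `0x55` matches a 16-bit use of `0x1255` (the recorded value is then widened, as `AddUse` does).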

				// The pass transforms sequences of instructions that use the same immediate
				// value into equivalent sequence that uses that value loaded into a register.
				struct MaterializeImmediates : public MachineFunctionPass {
				static char ID;
				ImmHistory History;
				unsigned TotalUsedRegs;
				const TargetInstrInfo *TII;
				MachineRegisterInfo *MRegInfo;
				bool Mode64Bit;
				const X86RegisterInfo *RegInfo;

				MaterializeImmediates() : MachineFunctionPass(ID) { OpCodeTable::Sort(); }

				virtual bool runOnMachineFunction(MachineFunction &MF);

				virtual const char *getPassName() const {
				return "X86 constant materializer";
				}

				bool CanBeTransformed(MachineInstr &Instr, ImmediateValueUse &Info);
				bool UseCanBeAddedToValue(RecordedValue &ImmV, ImmediateValueUse &Info);
				bool CanMaterialize(RecordedValue &ImmV);
				const TargetRegisterClass *
				GetCommonClass(const TargetRegisterClass *ImmRegClass,
				const TargetRegisterClass *LoadRegClass);
				void ProcessExpiredValue(bool Finish);
				void ScheduleForMaterialization(RecordedValue &ImmV);
				void StartMaterialization(RecordedValue &ImmV);
				void MaterializeValues(RecordedValue &ImmV);
				unsigned FinishMaterialization(RecordedValue &ImmV);
				};

				char MaterializeImmediates::ID = 0;
				}

				bool MaterializeImmediates::runOnMachineFunction(MachineFunction &MF) {
				MRegInfo = &MF.getRegInfo();
				const TargetSubtargetInfo &STI = MF.getSubtarget();
				TII = STI.getInstrInfo();
				Mode64Bit = (STI.getFeatureBits() & X86::Mode64Bit) != 0;
				RegInfo = static_cast<const X86RegisterInfo *>(STI.getRegisterInfo());
				unsigned StartNumMaterialized = NumMatImmediates;

				for (MachineBasicBlock &MBB : MF) {
				History.Clear();
				TotalUsedRegs = 0;
				for (auto MII = MBB.begin(), MIE = MBB.end(); MII != MIE; ++MII) {
				History.Advance();
				// Time has advanced, so some values kept in the history may need removal.
				// Do this check only if some value needs materialization, which speeds up
				// the pass. Values that are expired and do not need moving to a register
				// are dropped when we look for a slot for a new value.
				if (TotalUsedRegs)
				ProcessExpiredValue(false);

				// Process the current instruction.
				ImmediateValueUse Info;
				if (CanBeTransformed(*MII, Info)) {
				RecordedValue *ValPtr = History.FindValue(Info);
				if (ValPtr && !UseCanBeAddedToValue(*ValPtr, Info))
				continue;
				RecordedValue &ImmV = History.AddUse(ValPtr, MII, Info);
				if (ImmV.IsInReg()) {
				MaterializeValues(ImmV);
				} else if (ImmV.IsScheduled()) {
				if (ImmV.IsFull())
				StartMaterialization(ImmV);
				} else if (CanMaterialize(ImmV)) {
				if (TotalUsedRegs < MaxRegisters)
				ScheduleForMaterialization(ImmV);
				else
				ImmV.MarkWaiting();
				}
				}
				}

				// Finish pending materialization requests.
				while (TotalUsedRegs)
				ProcessExpiredValue(true);
				}
				return NumMatImmediates > StartNumMaterialized;
				}

				// Checks if the given machine instruction uses an immediate value and
				// whether it can be transformed into a form that uses a register instead. If
				// so, initializes argument 'Info' with the respective information.
				bool MaterializeImmediates::CanBeTransformed(MachineInstr &Instr,
				ImmediateValueUse &Info) {
				InstrInfo *Rec = OpCodeTable::Find(Instr.getOpcode());
				if (Rec == 0)
				qcolombet: nullptr instead of 0.
				return false;
				assert(Rec->Opcode == Instr.getOpcode());

				// Skip things like addresses represented by immediate values.
				if (!Instr.getOperand(Rec->OperandNo).isImm())
				return false;

				if (Rec->IsLoad) {
				unsigned LoadedReg = Instr.getOperand(0).getReg();
				if (MRegInfo->hasOneUse(LoadedReg))
				return false;
				}

				Info.Init(Instr.getOperand(Rec->OperandNo).getImm(), Rec);
				return true;
				}

				// If the immediate value is already present in history, this method decides
				// whether the instruction specified by argument 'Info' can be added to the
				// uses of the value.
				bool MaterializeImmediates::UseCanBeAddedToValue(RecordedValue &ImmV,
				ImmediateValueUse &Info) {
				// If a value is already in a register that requires REX, replacing byte
				// immediates does not give gain in size.
				if (ImmV.IsInReg() && Mode64Bit && ImmV.UsesREX && Info.GetSize() == DATA_8)
				return false;
				return true;
				}

				// Checks if it is profitable to put the immediate value into a register.
				bool MaterializeImmediates::CanMaterialize(RecordedValue &ImmV) {
				assert(!ImmV.IsFree());
				return ImmV.GetProfit() >= MinProfit;
				}
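
The profitability test reduces to byte counting, which can be sketched for the 8-, 16- and 32-bit cases. The sketch rests on assumptions: every rewritten use is taken to save as many bytes as its immediate occupies (1, 2 or 4, matching the last column of the `CMP*mi` table rows), the load sizes are those quoted in `GetLoadInstructionSize`, `netGain` is a hypothetical helper, and neither the `MinProfit` threshold nor the extra REX byte that `StartMaterialization` may add is modelled.

```cpp
#include <cassert>

// Net gain in bytes from keeping one immediate in a register, given how many
// 8-, 16- and 32-bit uses would be rewritten. Hypothetical helper.
inline int netGain(int Uses8, int Uses16, int Uses32) {
  // Immediate bytes removed from the rewritten instructions.
  int Saved = 1 * Uses8 + 2 * Uses16 + 4 * Uses32;
  // Cost of the mov imm->reg that has to be inserted; the widest use decides.
  int LoadSize;
  if (Uses32 > 0)
    LoadSize = 5; // B8 id
  else if (Uses16 > 0)
    LoadSize = 4; // 66 B8 iw
  else
    LoadSize = 2; // B0 ib
  return Saved - LoadSize;
}
```

Under these assumptions two byte stores give `netGain(2, 0, 0) == 0`, three give `1`, and two 32-bit stores give `3`, which agrees with the expectations written in the comments of test/CodeGen/X86/materialize-imm.ll.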

				const TargetRegisterClass *
				MaterializeImmediates::GetCommonClass(const TargetRegisterClass *ImmRegClass,
				const TargetRegisterClass *LoadRegClass) {
				const TargetRegisterClass *CommonRC = nullptr;
				if (ImmRegClass->getSize() == LoadRegClass->getSize()) {
				if (LoadRegClass->hasSuperClassEq(ImmRegClass))
				CommonRC = LoadRegClass;
				else if (LoadRegClass->hasSubClass(ImmRegClass))
				CommonRC = ImmRegClass;
				} else if (LoadRegClass->getSize() < ImmRegClass->getSize()) {
				unsigned SubReg =
				GetSubregByBytes(ImmRegClass->getSize(), LoadRegClass->getSize());
				if (const TargetRegisterClass *MRC = RegInfo->getMatchingSuperRegClass(
				ImmRegClass, LoadRegClass, SubReg)) {
				CommonRC = MRC;
				}
				}
				return CommonRC;
				}

				// Try to find a value that needs to be removed from the history and finish
				// its materialization.
				void MaterializeImmediates::ProcessExpiredValue(bool Finish) {
				// Find a value that should be removed from history.
				unsigned LastInstructionSN = 0; // SN of the last instruction that used reg
				for (auto &Value : History) {
				if (!Value.IsFree() && (Finish || History.IsExpired(Value))) {
				// The value needs to be removed from history.
				if (Value.IsScheduled())
				// The value is marked for materialization, but register was not
				// allocated to it yet.
				StartMaterialization(Value);
				if (Value.IsInReg())
				LastInstructionSN = FinishMaterialization(Value);
				if (!Value.IsFree())
				History.Drop(Value);
				if (LastInstructionSN)
				break;
				}
				}

				if (LastInstructionSN == 0) // No value is processed
				return;

				// There may be a value that is profitable to move to a register but was
				// not moved for lack of registers. Now that a register has become available,
				// such a value can be cached.
				for (auto &Value : History) {
				if (Value.IsWaiting()) {
				// Scan recorded uses until we find a use that occurs after a register
				// becomes available. All preceding uses are discarded.
				unsigned NumOfDropped = 0;
				for (auto &Use : Value) {
				if (Use.GetSN() < LastInstructionSN)
				++NumOfDropped;
				}
				Value.DropOldestUses(NumOfDropped);
				if (!Value.IsUsed())
				continue;
				if (CanMaterialize(Value)) {
				ScheduleForMaterialization(Value);
				StartMaterialization(Value);
				} else if (Finish)
				History.Drop(Value);
				}
				}
				}

				// Marks the value as ready for loading to a register, but does not change
				// instructions. Actual replacement is postponed until history is full or the
				// value is to be removed from tracking.
				void MaterializeImmediates::ScheduleForMaterialization(RecordedValue &ImmV) {
				ImmV.MarkSchedule();
				assert(TotalUsedRegs < MaxRegisters);
				ImmV.FirstInstruction = ImmV.FirstUse().GetInstr();
				++TotalUsedRegs;
				}

				// Replaces the instructions that use the specified immediate value by their
				// variants that use register. After this call any subsequent instruction that
				// uses the same immediate value is not put into history, but is immediately
				// transformed into the form that uses register.
				void MaterializeImmediates::StartMaterialization(RecordedValue &ImmV) {
				assert(ImmV.IsScheduled());

				// Determine actual immediate size and choose appropriate register class
				// for it.
				bool AlwaysUseREX;
				int LoadInstructionSize = ImmV.GetLoadInstructionSize(AlwaysUseREX);
				bool REXMayBeUsed = AlwaysUseREX;
				const TargetRegisterClass *RegClass;
				if (AlwaysUseREX) {
				// 64-bit value, upper half != 0
				RegClass = &X86::GR64RegClass;
				} else if (ImmV.SizeCnt[DATA_64] != 0) {
				// 64-bit value, loaded as zext'ed 32-bit
				if (ImmV.SizeCnt[DATA_8] == 0) {
				RegClass = &X86::GR64RegClass;
				REXMayBeUsed = true;
				} else {
				// Replacement MOV8mi -> MOV8ri is not profitable if register requires REX
				RegClass = &X86::GR64_NOREXRegClass;
				}
				} else if (ImmV.SizeCnt[DATA_32] != 0) {
				if (!Mode64Bit || ImmV.SizeCnt[DATA_8] == 0) {
				RegClass = &X86::GR32RegClass;
				REXMayBeUsed = true;
				} else {
				RegClass = &X86::GR32_NOREXRegClass;
				}
				} else if (ImmV.SizeCnt[DATA_16] != 0) {
				RegClass = &X86::GR16_NOREXRegClass;
				} else { // 8-bit value
				RegClass = &X86::GR8_NOREXRegClass;
				}

				// If use set of the value contains load instruction, expences for load
				qcolombet: expences -> expenses.
				// instruction can be decreased.
				if (ImmV.HasLoad) {
				MachineInstr &LoadInstr = *ImmV.LoadInstruction->GetInstr();
				unsigned LoadedReg = LoadInstr.getOperand(0).getReg();
				const TargetRegisterClass *LoadedRC = MRegInfo->getRegClass(LoadedReg);
				const TargetRegisterClass *CommonRC = GetCommonClass(RegClass, LoadedRC);
				if (CommonRC) {
				RegClass = CommonRC;
				if (RegClass->getSize() == LoadedRC->getSize())
				LoadInstructionSize = 0;
				else
				LoadInstructionSize -= GetSizeOfMovImmToRegInstr(LoadInstr.getOpcode());
				}
				}

				// Assume the worst case - REX is used if it may be used.
				if (REXMayBeUsed)
				++LoadInstructionSize;
				if (ImmV.Profit <= LoadInstructionSize) {
				assert(TotalUsedRegs > 0);
				--TotalUsedRegs;
				History.Drop(ImmV);
				return;
				}

				// Keep info about allocated register.
				ImmV.MarkInReg(MRegInfo->createVirtualRegister(RegClass), REXMayBeUsed);
				ImmV.Profit -= LoadInstructionSize;

				// Insert load instruction
				int LoadOpCode = ImmV.GetLoadInstructionOpcode();
				const MCInstrDesc &IDescr = TII->get(LoadOpCode);
				InstrIterator FirstInstr = ImmV.FirstInstruction;
				BuildMI(*FirstInstr->getParent(), FirstInstr, FirstInstr->getDebugLoc(),
				IDescr, ImmV.Register).addImm(ImmV.Value);

				// Go through the value uses and replace immediates with register.
				MaterializeValues(ImmV);
				}

				// Replaces the specified use of immediate value with register use.
				void MaterializeImmediates::MaterializeValues(RecordedValue &ImmV) {
				assert(ImmV.IsInReg());

				for (auto &Use : ImmV) {
				MachineInstr &Instr = *Use.GetInstr();
				unsigned SubReg = GetSubreg(ImmV.RegSize, Use.GetSize());

				if (Use.IsLoad()) {
				unsigned LoadedReg = Instr.getOperand(0).getReg();

				// Class of the register used to store immediate must be a subclass
				// of the loaded register class, or its super register class.
				const TargetRegisterClass *LoadedRC = MRegInfo->getRegClass(LoadedReg);
				const TargetRegisterClass *ImmRC = MRegInfo->getRegClass(ImmV.Register);
				const TargetRegisterClass *CommonRC = GetCommonClass(ImmRC, LoadedRC);
				if (!CommonRC)
				continue;
				if (CommonRC != ImmRC)
				MRegInfo->setRegClass(ImmV.Register, CommonRC);

				// Replace all uses of the loaded register with the register used for
				// immediate value.
				unsigned LoadSize = LoadedRC->getSize();
				unsigned ImmSize = SizeInBytes(ImmV.GetSize());
				assert(LoadSize <= ImmSize);
				MRegInfo->replaceRegWith(LoadedReg, ImmV.Register);
				Instr.removeFromParent();
				} else {
				const MCInstrDesc &UseInstr = TII->get(Use.GetNewOpCode());
				Instr.setDesc(UseInstr);
				Instr.getOperand(Use.GetOperand()).ChangeToRegister(ImmV.Register, false);
				Instr.getOperand(Use.GetOperand()).setSubReg(SubReg);
				}
				}
				ImmV.ReplacedCnt += ImmV.TotalUses;
				ImmV.Uses[0] = ImmV.LastUse();
				ImmV.TotalUses = 1;
				}

				// Prepares the immediate value for removing from history.
				unsigned MaterializeImmediates::FinishMaterialization(RecordedValue &Value) {
				assert(Value.IsInReg());
				unsigned LastInstructionSN = Value.LastUse().GetSN();
				Value.MarkFree();
				History.Drop(Value);

				// Update statistics
				++NumMatImmediates;
				NumMatBytes += Value.Profit;
				NumMatInstructions += Value.ReplacedCnt;
				--TotalUsedRegs;

				return LastInstructionSN;
				}

				FunctionPass *llvm::createX86MaterializeImmediates() {
				if (MaxImmediates && MaxImmediates > MaxHistoryWidth)
				MaxImmediates = MaxHistoryWidth;
				if (HistoryDepth && HistoryDepth > MaxHistoryDepth)
				HistoryDepth = MaxHistoryDepth;
				if (MaxSeparatingInstrs && MaxSeparatingInstrs > MaxHistoryWidth)
				MaxSeparatingInstrs = MaxHistoryWidth;
				return new MaterializeImmediates();
				}

lib/Target/X86/X86TargetMachine.cpp

 }

 bool X86PassConfig::addILPOpts() {
   addPass(&EarlyIfConverterID);
   return true;
 }

 bool X86PassConfig::addPreRegAlloc() {
-  return false; // -print-machineinstr shouldn't print after this.
+  addPass(createX86MaterializeImmediates());
+  return true;
+  // return false; // -print-machineinstr shouldn't print after this.
 }

 bool X86PassConfig::addPostRegAlloc() {
   addPass(createX86FloatingPointStackifierPass());
   return true; // -print-machineinstr should print after this.
 }

 bool X86PassConfig::addPreEmitPass() {

test/CodeGen/X86/coalescer-commute3.ll

-; RUN: llc < %s -mtriple=i686-apple-darwin -mattr=+sse2 | grep mov | count 6
+; RUN: llc < %s -mtriple=i686-apple-darwin -mattr=+sse2 | grep mov | count 8

 %struct.quad_struct = type { i32, i32, %struct.quad_struct*, %struct.quad_struct*, %struct.quad_struct*, %struct.quad_struct*, %struct.quad_struct* }

 define i32 @perimeter(%struct.quad_struct* %tree, i32 %size) nounwind {
 entry:
   switch i32 %size, label %UnifiedReturnBlock [
     i32 2, label %bb
     i32 0, label %bb50

test/CodeGen/X86/fast-isel-x86.ll


 ; Check that fast-isel cleans up when it fails to lower a call instruction.
 define void @test5() {
 entry:
   %call = call i32 @test5dllimport(i32 42)
   ret void
 ; CHECK-LABEL: test5:
 ; Local value area is still there:
-; CHECK: movl $42, {{%[a-z]+}}
+; CHECK: movl $42, [[IMMREG:%[a-z]+]]
 ; Fast-ISel's arg push is not here:
 ; CHECK-NOT: movl $42, (%esp)
 ; SDag-ISel's arg push:
 ; CHECK: movl %esp, [[REGISTER:%[a-z]+]]
-; CHECK: movl $42, ([[REGISTER]])
+; CHECK: movl [[IMMREG]], ([[REGISTER]])
 ; CHECK: movl __imp__test5dllimport
 }
 declare dllimport i32 @test5dllimport(i32)

test/CodeGen/X86/materialize-imm.ll

This file was added.

				; RUN: llc < %s -mtriple=i686-pc-linux-gnu | FileCheck %s
				; RUN: llc < %s -mtriple=x86_64-pc-linux-gnu | FileCheck %s

				; // 2 byte move replacement gives zero gain
				; void func_01(char* x) {
				; *x = 11;
				; *(x+3) = 11;
				; }
				define void @func_01(i8* nocapture %x) {
				entry:
				store i8 11, i8* %x, align 1
				%add.ptr = getelementptr inbounds i8* %x, i32 3
				store i8 11, i8* %add.ptr, align 1
				ret void
				; CHECK-LABEL: func_01:
				; CHECK: movb $11,
				; CHECK: movb $11,
				; CHECK: ret
				}

				; // But 3 byte is already profitable
				; void func_02(char* x) {
				; *x = 11;
				; *(x+3) = 11;
				; *(x+6) = 11;
				; }
				define void @func_02(i8* nocapture %x) {
				entry:
				store i8 11, i8* %x, align 1
				%add.ptr = getelementptr inbounds i8* %x, i32 3
				store i8 11, i8* %add.ptr, align 1
				%add.ptr1 = getelementptr inbounds i8* %x, i32 6
				store i8 11, i8* %add.ptr1, align 1
				ret void
				; CHECK-LABEL: func_02:
				; CHECK: movb $11, %[[REG:[a-zA-Z0-9]+]]
				; CHECK: movb %[[REG]],
				; CHECK: movb %[[REG]],
				; CHECK: movb %[[REG]],
				; CHECK: ret
				}

				; // 2 32-bit move replacement is profitable
				; void func_03(char* x) {
				; *((int*)x) = 0x55555555;
				; *((int*)x+3) = 0x55555555;
				; }
				define void @func_03(i8* nocapture %x) {
				entry:
				%0 = bitcast i8* %x to i32*
				store i32 1431655765, i32* %0, align 4
				%add.ptr = getelementptr inbounds i8* %x, i32 12
				%1 = bitcast i8* %add.ptr to i32*
				store i32 1431655765, i32* %1, align 4
				ret void
				; CHECK-LABEL: func_03:
				; CHECK: movl $1431655765, %[[REG:[a-zA-Z0-9]+]]
				; CHECK: movl %[[REG]],
				; CHECK: movl %[[REG]],
				; CHECK: ret
				}

				; // 2 16-bit move replacement gives zero gain
				; void func_04(char* x) {
				; *((short*)x) = 0x5555;
				; *((short*)x+3) = 0x5555;
				; }
				define void @func_04(i8* nocapture %x) {
				entry:
				%0 = bitcast i8* %x to i16*
				store i16 21845, i16* %0, align 2
				%add.ptr = getelementptr inbounds i8* %x, i32 6
				%1 = bitcast i8* %add.ptr to i16*
				store i16 21845, i16* %1, align 2
				ret void
				; CHECK-LABEL: func_04:
				; CHECK: movw $21845,
				; CHECK: movw $21845,
				; CHECK: ret
				}


				; // but 3 16-bit move replacement is already profitable
				; void func_05(char* x) {
				; *((short*)x) = 0x5555;
				; *((short*)x+3) = 0x5555;
				; *((short*)x+9) = 0x5555;
				; }
				define void @func_05(i8* nocapture %x) {
				entry:
				%0 = bitcast i8* %x to i16*
				store i16 21845, i16* %0, align 2
				%add.ptr = getelementptr inbounds i8* %x, i32 6
				%1 = bitcast i8* %add.ptr to i16*
				store i16 21845, i16* %1, align 2
				%add.ptr1 = getelementptr inbounds i8* %x, i32 18
				%2 = bitcast i8* %add.ptr1 to i16*
				store i16 21845, i16* %2, align 2
				ret void
				; CHECK-LABEL: func_05:
				; CHECK: movw $21845, %[[REG:[a-zA-Z0-9]+]]
				; CHECK: movw %[[REG]],
				; CHECK: movw %[[REG]],
				; CHECK: ret
				}


				; // 1*16 + 1*8 is not profitable
				; void func_06(char* x) {
				; *(x) = 0x55;
				; *((short*)x+6) = 0x5555;
				; }
				define void @func_06(i8* nocapture %x) {
				entry:
				store i8 85, i8* %x, align 1
				%add.ptr = getelementptr inbounds i8* %x, i32 12
				%0 = bitcast i8* %add.ptr to i16*
				store i16 21845, i16* %0, align 2
				ret void
				; CHECK-LABEL: func_06:
				; CHECK: movb $85,
				; CHECK: movw $21845,
				; CHECK: ret
				}


				; // 1*16 + 2*8 gives zero gain
				; void func_06a(char* x) {
				; *(x) = 0x55;
				; *(x+3) = 0x55;
				; *((short*)x+9) = 0x5555;
				; }
				define void @func_06a(i8* nocapture %x) {
				entry:
				store i8 85, i8* %x, align 1
				%add.ptr = getelementptr inbounds i8* %x, i32 3
				store i8 85, i8* %add.ptr, align 1
				%add.ptr1 = getelementptr inbounds i8* %x, i32 12
				%0 = bitcast i8* %add.ptr1 to i16*
				store i16 21845, i16* %0, align 2
				ret void
				; CHECK-LABEL: func_06a:
				; CHECK: movb $85,
				; CHECK: movb $85,
				; CHECK: movw $21845,
				; CHECK: ret
				}


				; // 2*16 + 1*8 gives one byte gain
				; void func_06b(char* x) {
				; *(x) = 0x55;
				; *((short*)x+6) = 0x5555;
				; *((short*)x+9) = 0x5555;
				; }
				define void @func_06b(i8* nocapture %x) {
				entry:
				store i8 85, i8* %x, align 1
				%add.ptr = getelementptr inbounds i8* %x, i32 12
				%0 = bitcast i8* %add.ptr to i16*
				store i16 21845, i16* %0, align 2
				%add.ptr1 = getelementptr inbounds i8* %x, i32 18
				%1 = bitcast i8* %add.ptr1 to i16*
				store i16 21845, i16* %1, align 2
				ret void
				; CHECK-LABEL: func_06b:
				; CHECK: movw $21845, %[[REG:[a-z]]]x
				; CHECK: movb %[[REG]]l,
				; CHECK: movw %[[REG]]x,
				; CHECK: movw %[[REG]]x,
				; CHECK: ret
				}


				; // 2*16 + 2*8 is profitable
				; void func_07(char* x) {
				; *(x) = 0x55;
				; *(x+3) = 0x55;
				; *((short*)x+6) = 0x5555;
				; *((short*)x+9) = 0x5555;
				; }
				define void @func_07(i8* nocapture %x) #0 {
				entry:
				store i8 85, i8* %x, align 1
				%add.ptr = getelementptr inbounds i8* %x, i32 3
				store i8 85, i8* %add.ptr, align 1
				%add.ptr1 = getelementptr inbounds i8* %x, i32 12
				%0 = bitcast i8* %add.ptr1 to i16*
				store i16 21845, i16* %0, align 2
				%add.ptr2 = getelementptr inbounds i8* %x, i32 18
				%1 = bitcast i8* %add.ptr2 to i16*
				store i16 21845, i16* %1, align 2
				ret void
				; CHECK-LABEL: func_07:
				; CHECK: movw $21845, %[[REG:[a-z]]]x
				; CHECK: movb %[[REG]]l,
				; CHECK: movb %[[REG]]l,
				; CHECK: movw %[[REG]]x,
				; CHECK: movw %[[REG]]x,
				; CHECK: ret
				}

				; void func_08(char* x) {
				; *(x) = 0x55;
				; *(x+3) = 0x55;
				; *((short*)x+6) = 0x5555;
				; *((int*)x+9) = 0x55555555;
				; }
				define void @func_08(i8* nocapture %x) {
				entry:
				store i8 85, i8* %x, align 1
				%add.ptr = getelementptr inbounds i8* %x, i32 3
				store i8 85, i8* %add.ptr, align 1
				%add.ptr1 = getelementptr inbounds i8* %x, i32 12
				%0 = bitcast i8* %add.ptr1 to i16*
				store i16 21845, i16* %0, align 2
				%add.ptr2 = getelementptr inbounds i8* %x, i32 36
				%1 = bitcast i8* %add.ptr2 to i32*
				store i32 1431655765, i32* %1, align 4
				ret void
				; CHECK-LABEL: func_08:
				; CHECK: movl $1431655765, %e[[REG:[a-z]]]x
				; CHECK: movb %[[REG]]l,
				; CHECK: movb %[[REG]]l,
				; CHECK: movw %[[REG]]x,
				; CHECK: movl %e[[REG]]x,
				; CHECK: ret
				}

				; void func_09(char* x) {
				; *((int*)x+9) = 0x55555555;
				; *((short*)x+6) = 0x5555;
				; *(x) = 0x55;
				; *(x+3) = 0x55;
				; }
				define void @func_09(i8* nocapture %x) {
				entry:
				%add.ptr = getelementptr inbounds i8* %x, i32 36
				%0 = bitcast i8* %add.ptr to i32*
				store i32 1431655765, i32* %0, align 4
				%add.ptr1 = getelementptr inbounds i8* %x, i32 12
				%1 = bitcast i8* %add.ptr1 to i16*
				store i16 21845, i16* %1, align 2
				store i8 85, i8* %x, align 1
				%add.ptr2 = getelementptr inbounds i8* %x, i32 3
				store i8 85, i8* %add.ptr2, align 1
				ret void
				; CHECK-LABEL: func_09:
				; CHECK: movl $1431655765, %e[[REG:[a-z]]]x
				; CHECK: movl %e[[REG]]x,
				; CHECK: movw %[[REG]]x,
				; CHECK: movb %[[REG]]l,
				; CHECK: movb %[[REG]]l,
				; CHECK: ret
				}

				; void func_10(char* x) {
				; *((short*)x+6) = 0x5555;
				; *((int*)x+9) = 0x55555555;
				; *(x) = 0x55;
				; *(x+3) = 0x55;
				; }
				define void @func_10(i8* nocapture %x) {
				entry:
				%add.ptr = getelementptr inbounds i8* %x, i32 12
				%0 = bitcast i8* %add.ptr to i16*
				store i16 21845, i16* %0, align 2
				%add.ptr1 = getelementptr inbounds i8* %x, i32 36
				%1 = bitcast i8* %add.ptr1 to i32*
				store i32 1431655765, i32* %1, align 4
				store i8 85, i8* %x, align 1
				%add.ptr2 = getelementptr inbounds i8* %x, i32 3
				store i8 85, i8* %add.ptr2, align 1
				ret void
				; CHECK-LABEL: func_10:
				; CHECK: movl $1431655765, %e[[REG:[a-z]]]x
				; CHECK: movw %[[REG]]x,
				; CHECK: movl %e[[REG]]x,
				; CHECK: movb %[[REG]]l,
				; CHECK: movb %[[REG]]l,
				; CHECK: ret
				}

				; void func_11(char* x) {
				; *((short*)x+6) = 0x5555;
				; *(x) = 0x55;
				; *((int*)x+9) = 0x55555555;
				; *(x+3) = 0x55;
				; }
				define void @func_11(i8* nocapture %x) {
				entry:
				%add.ptr = getelementptr inbounds i8* %x, i32 12
				%0 = bitcast i8* %add.ptr to i16*
				store i16 21845, i16* %0, align 2
				store i8 85, i8* %x, align 1
				%add.ptr1 = getelementptr inbounds i8* %x, i32 36
				%1 = bitcast i8* %add.ptr1 to i32*
				store i32 1431655765, i32* %1, align 4
				%add.ptr2 = getelementptr inbounds i8* %x, i32 3
				store i8 85, i8* %add.ptr2, align 1
				ret void
				; CHECK-LABEL: func_11:
				; CHECK: movl $1431655765, %e[[REG:[a-z]]]x
				; CHECK: movw %[[REG]]x,
				; CHECK: movb %[[REG]]l,
				; CHECK: movl %e[[REG]]x,
				; CHECK: movb %[[REG]]l,
				; CHECK: ret
				}

				; void func_12(char* x) {
				; *((int*)x) = 0x55555555;
				; *((int*)x+2) = 0x55555555;
				; *((int*)x+5) = 0x33333333;
				; *((int*)x+7) = 0x33333333;
				; }
				define void @func_12(i8* nocapture %x) {
				entry:
				%0 = bitcast i8* %x to i32*
				store i32 1431655765, i32* %0, align 4
				%add.ptr = getelementptr inbounds i8* %x, i32 8
				%1 = bitcast i8* %add.ptr to i32*
				store i32 1431655765, i32* %1, align 4
				%add.ptr1 = getelementptr inbounds i8* %x, i32 20
				%2 = bitcast i8* %add.ptr1 to i32*
				store i32 858993459, i32* %2, align 4
				%add.ptr2 = getelementptr inbounds i8* %x, i32 28
				%3 = bitcast i8* %add.ptr2 to i32*
				store i32 858993459, i32* %3, align 4
				ret void
				; CHECK-LABEL: func_12:
				; CHECK: movl $1431655765, %e[[REG1:[a-z]]]x
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl $858993459, %e[[REG2:[a-z]]]x
				; CHECK: movl %e[[REG2]]x,
				; CHECK: movl %e[[REG2]]x,
				; CHECK: ret
				}

				; void func_13(char* x) {
				; *((int*)x) = 0x55555555;
				; *((int*)x+2) = 0x55555555;
				; *((int*)x+5) = 0x33333333;
				; *((int*)x+7) = 0x33333333;
				; *((int*)x+9) = 0x11111111;
				; *((int*)x+11) = 0x11111111;
				; *((int*)x+13) = 0x55555555;
				; }
				define void @func_13(i8* nocapture %x) {
				entry:
				%0 = bitcast i8* %x to i32*
				store i32 1431655765, i32* %0, align 4
				%add.ptr = getelementptr inbounds i8* %x, i32 8
				%1 = bitcast i8* %add.ptr to i32*
				store i32 1431655765, i32* %1, align 4
				%add.ptr1 = getelementptr inbounds i8* %x, i32 20
				%2 = bitcast i8* %add.ptr1 to i32*
				store i32 858993459, i32* %2, align 4
				%add.ptr2 = getelementptr inbounds i8* %x, i32 28
				%3 = bitcast i8* %add.ptr2 to i32*
				store i32 858993459, i32* %3, align 4
				%add.ptr3 = getelementptr inbounds i8* %x, i32 36
				%4 = bitcast i8* %add.ptr3 to i32*
				store i32 286331153, i32* %4, align 4
				%add.ptr4 = getelementptr inbounds i8* %x, i32 44
				%5 = bitcast i8* %add.ptr4 to i32*
				store i32 286331153, i32* %5, align 4
				%add.ptr5 = getelementptr inbounds i8* %x, i32 52
				%6 = bitcast i8* %add.ptr5 to i32*
				store i32 1431655765, i32* %6, align 4
				ret void
				; CHECK-LABEL: func_13:
				; CHECK: movl $1431655765, %e[[REG1:[a-z]]]x
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl $858993459, %e[[REG2:[a-z]]]x
				; CHECK: movl %e[[REG2]]x,
				; CHECK: movl %e[[REG2]]x,
				; CHECK: movl $286331153, %e[[REG3:[a-z]]]x
				; CHECK: movl %e[[REG3]]x,
				; CHECK: movl %e[[REG3]]x,
				; CHECK: movl $1431655765,
				; CHECK: ret
				}

				;void func_14(char* x) {
				; *((int*)x) = 0x55555555;
				; *((int*)x+2) = 0x55555555;
				; *((int*)x+3) = 0x55555555;
				; *((int*)x+15) = 0x55555555;
				; *((int*)x+13) = 0x55555555;
				; *((int*)x+6) = 0x55555555;
				; *((int*)x+7) = 0x55555555;
				; *((int*)x+9) = 0x55555555;
				; *((int*)x+11) = 0x55555555;
				; *((int*)x+12) = 0x55555555;
				;}
				define void @func_14(i8* nocapture %x) {
				entry:
				%0 = bitcast i8* %x to i32*
				store i32 1431655765, i32* %0, align 4
				%add.ptr = getelementptr inbounds i8* %x, i32 8
				%1 = bitcast i8* %add.ptr to i32*
				store i32 1431655765, i32* %1, align 4
				%add.ptr1 = getelementptr inbounds i8* %x, i32 12
				%2 = bitcast i8* %add.ptr1 to i32*
				store i32 1431655765, i32* %2, align 4
				%add.ptr2 = getelementptr inbounds i8* %x, i32 60
				%3 = bitcast i8* %add.ptr2 to i32*
				store i32 1431655765, i32* %3, align 4
				%add.ptr3 = getelementptr inbounds i8* %x, i32 52
				%4 = bitcast i8* %add.ptr3 to i32*
				store i32 1431655765, i32* %4, align 4
				%add.ptr4 = getelementptr inbounds i8* %x, i32 24
				%5 = bitcast i8* %add.ptr4 to i32*
				store i32 1431655765, i32* %5, align 4
				%add.ptr5 = getelementptr inbounds i8* %x, i32 28
				%6 = bitcast i8* %add.ptr5 to i32*
				store i32 1431655765, i32* %6, align 4
				%add.ptr6 = getelementptr inbounds i8* %x, i32 36
				%7 = bitcast i8* %add.ptr6 to i32*
				store i32 1431655765, i32* %7, align 4
				%add.ptr7 = getelementptr inbounds i8* %x, i32 44
				%8 = bitcast i8* %add.ptr7 to i32*
				store i32 1431655765, i32* %8, align 4
				%add.ptr8 = getelementptr inbounds i8* %x, i32 48
				%9 = bitcast i8* %add.ptr8 to i32*
				store i32 1431655765, i32* %9, align 4
				ret void
				; CHECK-LABEL: func_14:
				; CHECK: movl $1431655765, %e[[REG1:[a-z]]]x
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: movl %e[[REG1]]x,
				; CHECK: ret
				}
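
The "profitable", "zero gain" and "one byte gain" comments in materialize-imm.ll above can be reproduced with simple encoding arithmetic. The following is a minimal sketch of that arithmetic, not the cost model implemented in X86MaterializeImmediates.cpp. The names `materializeCost` and `estimatedGain` are hypothetical. It assumes 32-bit x86 encodings, where each rewritten store drops its immediate field, and that the narrower stores can use sub-registers of the register loaded at the widest width (as the `%e[[REG]]x` / `%[[REG]]x` / `%[[REG]]l` checks suggest).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Bytes needed for `mov $imm, %reg` on 32-bit x86, given the immediate width
// in bytes (1, 2 or 4): one opcode byte, the immediate itself, and one
// operand-size prefix byte (0x66) for the 16-bit form.
// Hypothetical helper, for illustration only.
int materializeCost(int WidthBytes) {
  return 1 + WidthBytes + (WidthBytes == 2 ? 1 : 0);
}

// Estimated code-size gain in bytes from replacing a group of immediate
// stores with one register load followed by register stores. Each element of
// StoreWidths is the width in bytes of one store. Every rewritten store saves
// its immediate field; the register is loaded once, at the widest width used.
// Hypothetical helper, for illustration only.
int estimatedGain(const std::vector<int> &StoreWidths) {
  if (StoreWidths.empty())
    return 0;
  int Saved = 0;
  int Widest = 0;
  for (int Width : StoreWidths) {
    Saved += Width;
    Widest = std::max(Widest, Width);
  }
  return Saved - materializeCost(Widest);
}

// Reproduces the gains stated in the test comments.
void checkTestComments() {
  assert(estimatedGain({4, 4}) == 3);       // func_03: profitable
  assert(estimatedGain({2, 2}) == 0);       // func_04: zero gain
  assert(estimatedGain({2, 2, 2}) == 2);    // func_05: profitable
  assert(estimatedGain({1, 2}) == -1);      // func_06: not profitable
  assert(estimatedGain({1, 1, 2}) == 0);    // func_06a: zero gain
  assert(estimatedGain({1, 2, 2}) == 1);    // func_06b: one byte gain
  assert(estimatedGain({1, 1, 2, 2}) == 2); // func_07: profitable
}
```

Under these assumptions a group is worth rewriting only when the estimate is strictly positive, which matches the tests: func_04 and func_06a keep their immediate stores, while func_06b is rewritten for a single byte.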

test/CodeGen/X86/memcpy-2.ll

; SSE2-Mingw32: movsd %xmm0, 16(%esp)		; SSE2-Mingw32: movsd %xmm0, 16(%esp)
; SSE2-Mingw32: movaps _.str, %xmm0		; SSE2-Mingw32: movaps _.str, %xmm0
; SSE2-Mingw32: movups %xmm0		; SSE2-Mingw32: movups %xmm0
; SSE2-Mingw32: movb $0, 24(%esp)		; SSE2-Mingw32: movb $0, 24(%esp)

; SSE1-LABEL: t1:		; SSE1-LABEL: t1:
; SSE1: movaps _.str, %xmm0		; SSE1: movaps _.str, %xmm0
; SSE1: movaps %xmm0		; SSE1: movaps %xmm0
; SSE1: movb $0, 24(%esp)		; SSE1: movl $0, %e[[REGISTER:[a-d]]]x
; SSE1: movl $0, 20(%esp)		; SSE1: movb %[[REGISTER]]l, 24(%esp)
; SSE1: movl $0, 16(%esp)		; SSE1: movl %e[[REGISTER]]x, 20(%esp)
		; SSE1: movl %e[[REGISTER]]x, 16(%esp)

; NOSSE-LABEL: t1:		; NOSSE-LABEL: t1:
; NOSSE: movb $0		; NOSSE: movl $0, %e[[REGISTER:[a-d]]]x
; NOSSE: movl $0		; NOSSE: movb %[[REGISTER]]l,
; NOSSE: movl $0		; NOSSE: movl %e[[REGISTER]]x,
; NOSSE: movl $0		; NOSSE: movl %e[[REGISTER]]x,
; NOSSE: movl $0		; NOSSE: movl %e[[REGISTER]]x,
		; NOSSE: movl %e[[REGISTER]]x,
; NOSSE: movl $101		; NOSSE: movl $101
; NOSSE: movl $1734438249		; NOSSE: movl $1734438249

; X86-64-LABEL: t1:		; X86-64-LABEL: t1:
; X86-64: movaps _.str(%rip), %xmm0		; X86-64: movaps _.str(%rip), %xmm0
; X86-64: movaps %xmm0		; X86-64: movaps %xmm0
; X86-64: movb $0		; X86-64: movb $0
; X86-64: movq $0		; X86-64: movq $0
	; X86-64: movq %rax, (%rdi)
tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %tmp2, i8* %tmp3, i32 16, i32 8, i1 false)		tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %tmp2, i8* %tmp3, i32 16, i32 8, i1 false)
ret void		ret void
}		}

define void @t4() nounwind {		define void @t4() nounwind {
entry:		entry:
; SSE2-Darwin-LABEL: t4:		; SSE2-Darwin-LABEL: t4:
; SSE2-Darwin: movw $120		; SSE2-Darwin: movw $120
; SSE2-Darwin: movl $2021161080		; SSE2-Darwin: movl $2021161080, [[REGISTER:%e[a-d]x]]
; SSE2-Darwin: movl $2021161080		; SSE2-Darwin: movl [[REGISTER]],
; SSE2-Darwin: movl $2021161080		; SSE2-Darwin: movl [[REGISTER]],
; SSE2-Darwin: movl $2021161080		; SSE2-Darwin: movl [[REGISTER]],
; SSE2-Darwin: movl $2021161080		; SSE2-Darwin: movl [[REGISTER]],
; SSE2-Darwin: movl $2021161080		; SSE2-Darwin: movl [[REGISTER]],
; SSE2-Darwin: movl $2021161080		; SSE2-Darwin: movl [[REGISTER]],
		; SSE2-Darwin: movl [[REGISTER]],

; SSE2-Mingw32-LABEL: t4:		; SSE2-Mingw32-LABEL: t4:
; SSE2-Mingw32: movw $120		; SSE2-Mingw32: movw $120
; SSE2-Mingw32: movl $2021161080		; SSE2-Mingw32: movl $2021161080, [[REGISTER:%e[a-d]x]]
; SSE2-Mingw32: movl $2021161080		; SSE2-Mingw32: movl [[REGISTER]],
; SSE2-Mingw32: movl $2021161080		; SSE2-Mingw32: movl [[REGISTER]],
; SSE2-Mingw32: movl $2021161080		; SSE2-Mingw32: movl [[REGISTER]],
; SSE2-Mingw32: movl $2021161080		; SSE2-Mingw32: movl [[REGISTER]],
; SSE2-Mingw32: movl $2021161080		; SSE2-Mingw32: movl [[REGISTER]],
; SSE2-Mingw32: movl $2021161080		; SSE2-Mingw32: movl [[REGISTER]],
		; SSE2-Mingw32: movl [[REGISTER]],

; SSE1-LABEL: t4:		; SSE1-LABEL: t4:
; SSE1: movw $120		; SSE1: movw $120
; SSE1: movl $2021161080		; SSE1: movl $2021161080, [[REGISTER:%e[a-d]x]]
; SSE1: movl $2021161080		; SSE1: movl [[REGISTER]],
; SSE1: movl $2021161080		; SSE1: movl [[REGISTER]],
; SSE1: movl $2021161080		; SSE1: movl [[REGISTER]],
; SSE1: movl $2021161080		; SSE1: movl [[REGISTER]],
; SSE1: movl $2021161080		; SSE1: movl [[REGISTER]],
; SSE1: movl $2021161080		; SSE1: movl [[REGISTER]],
		; SSE1: movl [[REGISTER]],

; NOSSE-LABEL: t4:		; NOSSE-LABEL: t4:
; NOSSE: movw $120		; NOSSE: movw $120
; NOSSE: movl $2021161080		; NOSSE: movl $2021161080, [[REGISTER:%e[a-d]x]]
; NOSSE: movl $2021161080		; NOSSE: movl [[REGISTER]],
; NOSSE: movl $2021161080		; NOSSE: movl [[REGISTER]],
; NOSSE: movl $2021161080		; NOSSE: movl [[REGISTER]],
; NOSSE: movl $2021161080		; NOSSE: movl [[REGISTER]],
; NOSSE: movl $2021161080		; NOSSE: movl [[REGISTER]],
; NOSSE: movl $2021161080		; NOSSE: movl [[REGISTER]],

; X86-64-LABEL: t4:		; X86-64-LABEL: t4:
; X86-64: movabsq $8680820740569200760, %rax		; X86-64: movabsq $8680820740569200760, %rax
; X86-64: movq %rax		; X86-64: movq %rax
; X86-64: movq %rax		; X86-64: movq %rax
; X86-64: movq %rax		; X86-64: movq %rax
; X86-64: movw $120		; X86-64: movw $120
; X86-64: movl $2021161080		; X86-64: movl $2021161080
%tmp1 = alloca [30 x i8]		%tmp1 = alloca [30 x i8]
%tmp2 = bitcast [30 x i8]* %tmp1 to i8*		%tmp2 = bitcast [30 x i8]* %tmp1 to i8*
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %tmp2, i8* getelementptr inbounds ([30 x i8]* @.str2, i32 0, i32 0), i32 30, i32 1, i1 false)		call void @llvm.memcpy.p0i8.p0i8.i32(i8* %tmp2, i8* getelementptr inbounds ([30 x i8]* @.str2, i32 0, i32 0), i32 30, i32 1, i1 false)
unreachable		unreachable
}		}

declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind		declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind

test/CodeGen/X86/memset.ll

	; RUN: llc < %s -march=x86 -mcpu=pentium2 -mtriple=i686-apple-darwin8.8.0 \| FileCheck %s --check-prefix=X86			; RUN: llc < %s -march=x86 -mcpu=pentium2 -mtriple=i686-apple-darwin8.8.0 \| FileCheck %s --check-prefix=X86
	; RUN: llc < %s -march=x86 -mcpu=pentium3 -mtriple=i686-apple-darwin8.8.0 \| FileCheck %s --check-prefix=XMM			; RUN: llc < %s -march=x86 -mcpu=pentium3 -mtriple=i686-apple-darwin8.8.0 \| FileCheck %s --check-prefix=XMM
	; RUN: llc < %s -march=x86 -mcpu=bdver1 -mtriple=i686-apple-darwin8.8.0 \| FileCheck %s --check-prefix=YMM			; RUN: llc < %s -march=x86 -mcpu=bdver1 -mtriple=i686-apple-darwin8.8.0 \| FileCheck %s --check-prefix=YMM

	%struct.x = type { i16, i16 }			%struct.x = type { i16, i16 }

	define void @t() nounwind {			define void @t() nounwind {
	entry:			entry:
	%up_mvd = alloca [8 x %struct.x] ; <[8 x %struct.x]*> [#uses=2]			%up_mvd = alloca [8 x %struct.x] ; <[8 x %struct.x]*> [#uses=2]
	%up_mvd116 = getelementptr [8 x %struct.x]* %up_mvd, i32 0, i32 0 ; <%struct.x*> [#uses=1]			%up_mvd116 = getelementptr [8 x %struct.x]* %up_mvd, i32 0, i32 0 ; <%struct.x*> [#uses=1]
	%tmp110117 = bitcast [8 x %struct.x]* %up_mvd to i8* ; <i8*> [#uses=1]			%tmp110117 = bitcast [8 x %struct.x]* %up_mvd to i8* ; <i8*> [#uses=1]

	call void @llvm.memset.p0i8.i64(i8* %tmp110117, i8 0, i64 32, i32 8, i1 false)			call void @llvm.memset.p0i8.i64(i8* %tmp110117, i8 0, i64 32, i32 8, i1 false)
	; X86: movl $0,			; X86: movl $0, [[REGISTER:%[a-z]+]]
	; X86: movl $0,			; X86: movl [[REGISTER]],
	; X86: movl $0,			; X86: movl [[REGISTER]],
	; X86: movl $0,			; X86: movl [[REGISTER]],
	; X86: movl $0,			; X86: movl [[REGISTER]],
	; X86: movl $0,			; X86: movl [[REGISTER]],
	; X86: movl $0,			; X86: movl [[REGISTER]],
	; X86: movl $0,			; X86: movl [[REGISTER]],
	; X86-NOT: movl $0,			; X86: movl [[REGISTER]],
	; X86: ret			; X86: ret

	; XMM: xorps %xmm{{[0-9]+}}, [[Z:%xmm[0-9]+]]			; XMM: xorps %xmm{{[0-9]+}}, [[Z:%xmm[0-9]+]]
	; XMM: movaps [[Z]],			; XMM: movaps [[Z]],
	; XMM: movaps [[Z]],			; XMM: movaps [[Z]],
	; XMM-NOT: movaps			; XMM-NOT: movaps
	; XMM: ret			; XMM: ret

	declare void @foo(%struct.x*)			declare void @foo(%struct.x*)

	declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) nounwind			declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) nounwind

	define void @PR15348(i8* %a) {			define void @PR15348(i8* %a) {
	; Ensure that alignment of '0' in an @llvm.memset intrinsic results in			; Ensure that alignment of '0' in an @llvm.memset intrinsic results in
	; unaligned loads and stores.			; unaligned loads and stores.
	; XMM: PR15348			; XMM: PR15348
	; XMM: movb $0,			; XMM: movl $0, %e[[REGISTER:[a-d]]]x
	; XMM: movl $0,			; XMM: movb %[[REGISTER]]l,
	; XMM: movl $0,			; XMM: movl %e[[REGISTER]]x,
	; XMM: movl $0,			; XMM: movl %e[[REGISTER]]x,
	; XMM: movl $0,			; XMM: movl %e[[REGISTER]]x,
				; XMM: movl %e[[REGISTER]]x,
	call void @llvm.memset.p0i8.i64(i8* %a, i8 0, i64 17, i32 0, i1 false)			call void @llvm.memset.p0i8.i64(i8* %a, i8 0, i64 17, i32 0, i1 false)
	ret void			ret void
	}			}

test/CodeGen/X86/memset64-on-x86-32.ll

	; RUN: llc < %s -mtriple=i386-apple-darwin -mcpu=nehalem \| grep movups \| count 5			; RUN: llc < %s -mtriple=i386-apple-darwin -mcpu=nehalem \| grep movups \| count 5
	; RUN: llc < %s -mtriple=i386-apple-darwin -mcpu=core2 \| grep movl \| count 20			; RUN: llc < %s -mtriple=i386-apple-darwin -mcpu=core2 \| grep movl \| count 21
	; RUN: llc < %s -mtriple=i386-pc-mingw32 -mcpu=core2 \| grep movl \| count 20			; RUN: llc < %s -mtriple=i386-pc-mingw32 -mcpu=core2 \| grep movl \| count 21
	; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=core2 \| grep movq \| count 10			; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=core2 \| grep movq \| count 10

	define void @bork() nounwind {			define void @bork() nounwind {
	entry:			entry:
	call void @llvm.memset.p0i8.i64(i8* null, i8 0, i64 80, i32 4, i1 false)			call void @llvm.memset.p0i8.i64(i8* null, i8 0, i64 80, i32 4, i1 false)
	ret void			ret void
	}			}

	declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) nounwind			declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) nounwind

test/CodeGen/X86/nancvt.ll

	; RUN: opt < %s -std-compile-opts \| llc > %t			; RUN: opt < %s -std-compile-opts \| llc > %t
	; RUN: grep 2147027116 %t \| count 3			; RUN: grep 2147027116 %t \| count 1
	; RUN: grep 2147228864 %t \| count 3			; RUN: grep 2147228864 %t \| count 1
	; RUN: grep 2146502828 %t \| count 3			; RUN: grep 2146502828 %t \| count 1
	; RUN: grep 2143034560 %t \| count 3			; RUN: grep 2143034560 %t \| count 1
	; Compile time conversions of NaNs.			; Compile time conversions of NaNs.
	; ModuleID = 'nan2.c'			; ModuleID = 'nan2.c'
	target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"			target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
	target triple = "i686-apple-darwin8"			target triple = "i686-apple-darwin8"
	%struct..0anon = type { float }			%struct..0anon = type { float }
	%struct..1anon = type { double }			%struct..1anon = type { double }
	@fnan = constant [3 x i32] [ i32 2143831397, i32 2143831396, i32 2143831398 ] ; <[3 x i32]*> [#uses=1]			@fnan = constant [3 x i32] [ i32 2143831397, i32 2143831396, i32 2143831398 ] ; <[3 x i32]*> [#uses=1]
	@dnan = constant [3 x i64] [ i64 9223235251041752696, i64 9223235251041752697, i64 9223235250773317239 ], align 8 ; <[3 x i64]*> [#uses=1]			@dnan = constant [3 x i64] [ i64 9223235251041752696, i64 9223235251041752697, i64 9223235250773317239 ], align 8 ; <[3 x i64]*> [#uses=1]

test/CodeGen/X86/pr14562.ll

	; RUN: llc < %s -march=x86 \| FileCheck %s			; RUN: llc < %s -march=x86 \| FileCheck %s

	@temp1 = global i64 -77129852189294865, align 8			@temp1 = global i64 -77129852189294865, align 8

	define void @foo() nounwind {			define void @foo() nounwind {
	%x = load i64* @temp1, align 8			%x = load i64* @temp1, align 8
	%s = shl i64 %x, 32			%s = shl i64 %x, 32
	%t = trunc i64 %s to i32			%t = trunc i64 %s to i32
	%z = zext i32 %t to i64			%z = zext i32 %t to i64
	store i64 %z, i64* @temp1, align 8			store i64 %z, i64* @temp1, align 8
	; CHECK: movl $0, {{_?}}temp1+4			; CHECK: movl $0, [[REGISTER:%e[a-d]x]]
	; CHECK: movl $0, {{_?}}temp1			; CHECK: movl [[REGISTER]], {{_?}}temp1+4
				; CHECK: movl [[REGISTER]], {{_?}}temp1
	ret void			ret void
	}			}

test/CodeGen/X86/pr18023.ll

	; RUN: llc < %s -mtriple x86_64-apple-macosx10.9.0 \| FileCheck %s			; RUN: llc < %s -mtriple x86_64-apple-macosx10.9.0 \| FileCheck %s
	; PR18023			; PR18023

	; CHECK: movabsq $4294967296, %rcx			; CHECK: movabsq $4294967296, %rcx
	; CHECK: movq %rcx, (%rax)			; CHECK: movq %rcx, (%rax)
	; CHECK: movl $1, 4(%rax)			; CHECK: movl $1, %r[[REG:[a-z]]]x
				; CHECK: movl %e[[REG]]x, 4(%rax)
	; CHECK: movl $0, 4(%rax)			; CHECK: movl $0, 4(%rax)
	; CHECK: movq $1, 4(%rax)			; CHECK: movq %r[[REG]]x, 4(%rax)

	@c = common global i32 0, align 4			@c = common global i32 0, align 4
	@a = common global [3 x i32] zeroinitializer, align 4			@a = common global [3 x i32] zeroinitializer, align 4
	@b = common global i32 0, align 4			@b = common global i32 0, align 4
	@.str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1			@.str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1

	define void @func() {			define void @func() {
	store i32 1, i32* getelementptr inbounds ([3 x i32]* @a, i64 0, i64 1), align 4			store i32 1, i32* getelementptr inbounds ([3 x i32]* @a, i64 0, i64 1), align 4

test/CodeGen/X86/tlv-1.ll

	; RUN: llc < %s -mtriple x86_64-apple-darwin -mcpu=core2 \| FileCheck %s			; RUN: llc < %s -mtriple x86_64-apple-darwin -mcpu=core2 \| FileCheck %s

	%struct.A = type { [48 x i8], i32, i32, i32 }			%struct.A = type { [48 x i8], i32, i32, i32 }

	@c = external thread_local global %struct.A, align 4			@c = external thread_local global %struct.A, align 4

	define void @main() nounwind ssp {			define void @main() nounwind ssp {
	; CHECK-LABEL: main:			; CHECK-LABEL: main:
	entry:			entry:
	call void @llvm.memset.p0i8.i64(i8* getelementptr inbounds (%struct.A* @c, i32 0, i32 0, i32 0), i8 0, i64 60, i32 1, i1 false)			call void @llvm.memset.p0i8.i64(i8* getelementptr inbounds (%struct.A* @c, i32 0, i32 0, i32 0), i8 0, i64 60, i32 1, i1 false)
	unreachable			unreachable
	; CHECK: movq _c@TLVP(%rip), %rdi			; CHECK: movq _c@TLVP(%rip), %rdi
	; CHECK-NEXT: callq *(%rdi)			; CHECK-NEXT: callq *(%rdi)
	; CHECK-NEXT: movl $0, 56(%rax)			; CHECK-NEXT: movl $0, %r[[REGISTER:[a-z]+]]
	; CHECK-NEXT: movq $0, 48(%rax)			; CHECK-NEXT: movl %e[[REGISTER]], 56(%rax)
				; CHECK-NEXT: movq %r[[REGISTER]], 48(%rax)
	}			}

	; rdar://10291355			; rdar://10291355
	define i32 @test() nounwind readonly ssp {			define i32 @test() nounwind readonly ssp {
	entry:			entry:
	; CHECK-LABEL: test:			; CHECK-LABEL: test:
	; CHECK: movq _a@TLVP(%rip),			; CHECK: movq _a@TLVP(%rip),
	; CHECK: callq *			; CHECK: callq *

test/CodeGen/X86/xmulo.ll

	; RUN: llc %s -o - \| FileCheck %s			; RUN: llc %s -o - \| FileCheck %s
	target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128-n8:16:32-S128"			target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128-n8:16:32-S128"
	target triple = "i386-apple-macosx10.8.0"			target triple = "i386-apple-macosx10.8.0"

	declare {i64, i1} @llvm.umul.with.overflow.i64(i64, i64) nounwind readnone			declare {i64, i1} @llvm.umul.with.overflow.i64(i64, i64) nounwind readnone
	declare i32 @printf(i8*, ...)			declare i32 @printf(i8*, ...)

	@.str = private unnamed_addr constant [10 x i8] c"%llx, %d\0A\00", align 1			@.str = private unnamed_addr constant [10 x i8] c"%llx, %d\0A\00", align 1

	define i32 @t1() nounwind {			define i32 @t1() nounwind {
	; CHECK-LABEL: t1:			; CHECK-LABEL: t1:
	; CHECK: movl $0, 12(%esp)			; CHECK: movl $0, [[REGISTER:%[a-z]+]]
	; CHECK: movl $0, 8(%esp)			; CHECK: movl [[REGISTER]], 12(%esp)
				; CHECK: movl [[REGISTER]], 8(%esp)
	; CHECK: movl $72, 4(%esp)			; CHECK: movl $72, 4(%esp)

	%1 = call {i64, i1} @llvm.umul.with.overflow.i64(i64 9, i64 8)			%1 = call {i64, i1} @llvm.umul.with.overflow.i64(i64 9, i64 8)
	%2 = extractvalue {i64, i1} %1, 0			%2 = extractvalue {i64, i1} %1, 0
	%3 = extractvalue {i64, i1} %1, 1			%3 = extractvalue {i64, i1} %1, 1
	%4 = zext i1 %3 to i32			%4 = zext i1 %3 to i32
	%5 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0), i64 %2, i32 %4)			%5 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0), i64 %2, i32 %4)
	ret i32 0			ret i32 0
	}			}

	define i32 @t2() nounwind {			define i32 @t2() nounwind {
	; CHECK-LABEL: t2:			; CHECK-LABEL: t2:
	; CHECK: movl $0, 12(%esp)			; CHECK: movl $0, [[REGISTER:%[a-z]+]]
	; CHECK: movl $0, 8(%esp)			; CHECK: movl [[REGISTER]], 12(%esp)
	; CHECK: movl $0, 4(%esp)			; CHECK: movl [[REGISTER]], 8(%esp)
				; CHECK: movl [[REGISTER]], 4(%esp)

	%1 = call {i64, i1} @llvm.umul.with.overflow.i64(i64 9, i64 0)			%1 = call {i64, i1} @llvm.umul.with.overflow.i64(i64 9, i64 0)
	%2 = extractvalue {i64, i1} %1, 0			%2 = extractvalue {i64, i1} %1, 0
	%3 = extractvalue {i64, i1} %1, 1			%3 = extractvalue {i64, i1} %1, 1
	%4 = zext i1 %3 to i32			%4 = zext i1 %3 to i32
	%5 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0), i64 %2, i32 %4)			%5 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0), i64 %2, i32 %4)
	ret i32 0			ret i32 0
	}			}

test/DebugInfo/X86/debug-loc-offset.ll

	; CHECK: DW_AT_name [DW_FORM_strp]{{.*}}"a"			; CHECK: DW_AT_name [DW_FORM_strp]{{.*}}"a"

	; CHECK: DW_TAG_variable			; CHECK: DW_TAG_variable
	; CHECK: DW_AT_location [DW_FORM_exprloc]			; CHECK: DW_AT_location [DW_FORM_exprloc]
	; CHECK-NOT: DW_AT_location			; CHECK-NOT: DW_AT_location

	; CHECK: .debug_loc contents:			; CHECK: .debug_loc contents:
	; CHECK: 0x00000000: Beginning address offset: 0x0000000000000000			; CHECK: 0x00000000: Beginning address offset: 0x0000000000000000
	; CHECK: Ending address offset: 0x000000000000001a			; CHECK: Ending address offset: 0x0000000000000017

	%struct.A = type { i32 (...)**, i32 }			%struct.A = type { i32 (...)**, i32 }

	; Function Attrs: nounwind			; Function Attrs: nounwind
	define i32 @_Z3bari(i32 %b) #0 {			define i32 @_Z3bari(i32 %b) #0 {
	entry:			entry:
	%b.addr = alloca i32, align 4			%b.addr = alloca i32, align 4
	store i32 %b, i32* %b.addr, align 4			store i32 %b, i32* %b.addr, align 4

