This is an archive of the discontinued LLVM Phabricator instance.

Differential D119777

[X86] Introduce x86-cmov-converter-force-all
Closed · Public

Authored by Amir on Feb 14 2022, 1:52 PM.

Details

Reviewers

MaskRay
chandlerc
skan
RKSimon
apostolakis

Commits

rGe38fc14c43b0: [X86] Introduce x86-cmov-converter-force-all

Summary

Introduce an option to expand all CMOV groups into hammocks, matching GCC's
-fno-if-conversion2 flag. The motivation is to leave CMOV conversion
opportunities to a binary optimizer that can make the decision based on branch
misprediction rate (available e.g. in Intel's LBR).

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Amir created this revision. Feb 14 2022, 1:52 PM

Herald added subscribers: pengfei, hiraditya. Feb 14 2022, 1:52 PM

Amir requested review of this revision. Feb 14 2022, 1:52 PM

Herald added a project: Restricted Project. Feb 14 2022, 1:52 PM

Herald added a subscriber: llvm-commits.

Amir added reviewers: MaskRay, chandlerc, skan. Feb 14 2022, 1:55 PM

Harbormaster completed remote builds in B149514: Diff 408606. Feb 14 2022, 3:53 PM

Could you point out the document for -fno-if-conversion2?

RKSimon added a reviewer: RKSimon. Feb 15 2022, 12:52 AM

In D119777#3322012, @skan wrote:

Could you point out the document for -fno-if-conversion2?

I guess it's in https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

In D119777#3322778, @pengfei wrote:

In D119777#3322012, @skan wrote:

Could you point out the document for -fno-if-conversion2?

I guess it's in https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Yes, it's there, as -fif-conversion2

In D119777#3324438, @Amir wrote:

In D119777#3322778, @pengfei wrote:

In D119777#3322012, @skan wrote:

Could you point out the document for -fno-if-conversion2?

I guess it's in https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Yes, it's there, as -fif-conversion2

The description for -fif-conversion2 reads as if it is enabling an optimization pass and that -fno-if-conversion2 just doesn't run that optimization pass. It doesn't read like -fno-if-conversion2 would expand cmovs. Is my understanding correct?

In D119777#3324555, @craig.topper wrote:

In D119777#3324438, @Amir wrote:

In D119777#3322778, @pengfei wrote:

In D119777#3322012, @skan wrote:

Could you point out the document for -fno-if-conversion2?

I guess it's in https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Yes, it's there, as -fif-conversion2

The description for -fif-conversion2 reads as if it is enabling an optimization pass and that -fno-if-conversion2 just doesn't run that optimization pass. It doesn't read like -fno-if-conversion2 would expand cmovs. Is my understanding correct?

Yes, it's correct. My understanding is that GCC handles cmov's the opposite way compared to LLVM:

GCC's IR has if-then-else constructs, and cmovs are inserted by the if-conversion pass (https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/ifcvt.cc;hb=3d8d8e34f796fefda53be9a6cec7c6c950856a14#l5782).
In LLVM, select is lowered into cmov, which is then expanded into if-then-else by the X86CmovConversion pass.

The two approaches are not equivalent, but to keep cmov opportunities until after the compiler, one would need -fno-if-conversion2 in GCC and -x86-cmov-converter-force-all in LLVM.
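For illustration, here is a minimal C++ sketch of the kind of source code this comment is about. It is loosely modeled on the MaxValue case in the test file; the function name and signature are assumptions for this example, not taken from the patch. A ternary like the one below typically becomes a select in LLVM IR, which the X86 backend may then emit either as a CMOV or, after X86CmovConversion, as a branch.

```cpp
#include <cstddef>

// Hypothetical example function (not part of the patch).
// Assumes n >= 1 so that a[0] is valid.
int maxValue(const int *a, std::size_t n) {
  int best = a[0];
  for (std::size_t i = 1; i < n; ++i) {
    // Source-level select: a candidate for either a CMOV or a
    // compare-and-branch sequence, depending on the compiler's choice.
    best = (a[i] > best) ? a[i] : best;
  }
  return best;
}
```

Which form the compiler picks does not change the result, only the performance characteristics, which is why the choice can in principle be deferred to a later tool.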

skan added inline comments. Feb 15 2022, 7:41 PM

llvm/lib/Target/X86/X86CmovConversion.cpp
186–195	We should at least have a quick return for `ForceAll` here b/c there is no more CMOV.

Please add at least one test file.

Amir added inline comments. Feb 15 2022, 11:19 PM

llvm/lib/Target/X86/X86CmovConversion.cpp
186–195	What do you mean by "there's no more CMOV"? ForceAll would have to similarly collect all blocks, then collectCmovCandidates and iterate over AllCmovGroups. I don't see an early return option.

skan added inline comments. Feb 15 2022, 11:33 PM

llvm/lib/Target/X86/X86CmovConversion.cpp
186–195	The code after line 203 should not be executed when ForceAll is enabled, right?

Addressing feedback, added tests

Amir marked 2 inline comments as done. Feb 16 2022, 12:39 PM

Amir added inline comments.

llvm/lib/Target/X86/X86CmovConversion.cpp
186–195	Right, I see what you mean. Fixed in the updated rev.
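The control flow discussed in this inline thread can be modeled outside LLVM. The sketch below is a simplified stand-in, not the pass itself: FakeCmov, FakeGroup, Result, and forcedConversion are hypothetical names invented for this example, and only the skip condition and the early return mirror the diff.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for MachineInstr and a CMOV group; not LLVM types.
struct FakeCmov {
  bool MayLoad; // true if the cmov has a memory operand
};
using FakeGroup = std::vector<FakeCmov>;

struct Result {
  std::vector<bool> Converted; // one flag per group: converted to a branch?
  bool EarlyReturn;            // true if the later loop-based analysis is skipped
};

Result forcedConversion(const std::vector<FakeGroup> &Groups,
                        bool ForceMemOperand, bool ForceAll) {
  Result R{std::vector<bool>(Groups.size(), false), false};
  if (ForceMemOperand || ForceAll) {
    for (std::size_t I = 0; I < Groups.size(); ++I) {
      const FakeGroup &G = Groups[I];
      bool NoLoads = std::none_of(G.begin(), G.end(),
                                  [](const FakeCmov &C) { return C.MayLoad; });
      // Same condition as in the diff: load-free groups are skipped only
      // when ForceAll is off.
      if (ForceMemOperand && !ForceAll && NoLoads)
        continue;
      R.Converted[I] = true;
    }
    // With ForceAll every group has been converted, so nothing is left
    // for the register-operand analysis that follows in the real pass.
    if (ForceAll)
      R.EarlyReturn = true;
  }
  return R;
}
```

With ForceMemOperand alone, only groups containing a load are converted and the later analysis still runs; with ForceAll, every group is converted and the function returns early.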

Amir marked an inline comment as done. Feb 16 2022, 12:39 PM

Harbormaster completed remote builds in B150045: Diff 409366. Feb 16 2022, 1:50 PM

skan added inline comments. Feb 16 2022, 6:55 PM

llvm/lib/Target/X86/X86CmovConversion.cpp
101–104	LGTM in general. Please reformat this.

Add @apostolakis per https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040

Note, Amir, that the plan as outlined and discussed in the RFC that @MaskRay linked is to enable a new target-agnostic, profile-guided pass for cmov/branch decision-making and to disable this profile-agnostic, x86-specific pass when profile information is available. Initially, the new pass will only be enabled for instr-FDO. The next step is to enable it for Sample-FDO and, as discussed with @modimo, this will take advantage of LBR data for computing misprediction rates. For the time being I am fine with enabling this option here, but be aware that this x86 pass might be disabled in the near future.

In D119777#3330058, @apostolakis wrote:

Note, Amir, that the plan as outlined and discussed in the RFC that @MaskRay linked is to enable a new target-agnostic, profile-guided pass for cmov/branch decision-making and to disable this profile-agnostic, x86-specific pass when profile information is available. Initially, the new pass will only be enabled for instr-FDO. The next step is to enable it for Sample-FDO and, as discussed with @modimo, this will take advantage of LBR data for computing misprediction rates. For the time being I am fine with enabling this option here, but be aware that this x86 pass might be disabled in the near future.

Hi Sotiris,
Sounds good to me. The option added here is for experimentation ATM. Hoping to see the new pass with LBR data available soon!
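As a rough illustration of why a misprediction rate helps with this decision, the sketch below compares two costs under a deliberately simple model. Everything here is an assumption for the sake of the example: BranchProfile, preferBranch, and the cost parameters are hypothetical, and this is not the model used by the select-optimize pass or by any binary optimizer.

```cpp
// Hypothetical profile record; a real tool might derive this from LBR samples.
struct BranchProfile {
  double MispredictRate; // fraction of executions mispredicted, in [0, 1]
};

// Returns true if a branch looks cheaper than a cmov under a toy model:
//   expected branch cost = MispredictRate * MispredictPenalty (cycles)
//   cmov cost            = CmovLatency (cycles added to the dependency chain)
// Both cost parameters are assumptions supplied by the caller.
bool preferBranch(BranchProfile P, double MispredictPenalty,
                  double CmovLatency) {
  return P.MispredictRate * MispredictPenalty < CmovLatency;
}
```

A well-predicted condition favors the branch and a poorly predicted one favors the cmov, a distinction that a profile-agnostic pass has no direct way to observe.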

skan accepted this revision. Feb 23 2022, 4:33 PM

This revision is now accepted and ready to land. Feb 23 2022, 4:33 PM

MaskRay accepted this revision. Feb 23 2022, 5:40 PM

This revision was landed with ongoing or failed builds. Feb 24 2022, 10:47 AM

Closed by commit rGe38fc14c43b0: [X86] Introduce x86-cmov-converter-force-all (authored by Amir).

This revision was automatically updated to reflect the committed changes.

Amir added a commit: rGe38fc14c43b0: [X86] Introduce x86-cmov-converter-force-all.

Amir mentioned this in D120230: [SelectOpti][1/5] Setup new select-optimize pass. Mar 14 2022, 10:34 PM

Revision Contents

Path	Size
llvm/lib/Target/X86/X86CmovConversion.cpp	21 lines
llvm/test/CodeGen/X86/x86-cmov-converter.ll	306 lines

Diff 411180

llvm/lib/Target/X86/X86CmovConversion.cpp

@@ 91 lines not shown @@ GainCycleThreshold("x86-cmov-converter-threshold",
                        cl::desc("Minimum gain per loop (in cycles) threshold."),
                        cl::init(4), cl::Hidden);

 static cl::opt<bool> ForceMemOperand(
     "x86-cmov-converter-force-mem-operand",
     cl::desc("Convert cmovs to branches whenever they have memory operands."),
     cl::init(true), cl::Hidden);

+static cl::opt<bool> ForceAll(
+    "x86-cmov-converter-force-all",
+    cl::desc("Convert all cmovs to branches."),
+    cl::init(false), cl::Hidden);
+
 namespace {

 /// Converts X86 cmov instructions into branches when profitable.
 class X86CmovConverterPass : public MachineFunctionPass {
 public:
   X86CmovConverterPass() : MachineFunctionPass(ID) { }

   StringRef getPassName() const override { return "X86 cmov Conversion"; }
@@ 61 lines not shown @@ bool X86CmovConverterPass::runOnMachineFunction(MachineFunction &MF) {
   MLI = &getAnalysis<MachineLoopInfo>();
   const TargetSubtargetInfo &STI = MF.getSubtarget();
   MRI = &MF.getRegInfo();
   TII = STI.getInstrInfo();
   TRI = STI.getRegisterInfo();
   TSchedModel.init(&STI);

   // Before we handle the more subtle cases of register-register CMOVs inside
-  // of potentially hot loops, we want to quickly remove all CMOVs with
-  // a memory operand. The CMOV will risk a stall waiting for the load to
-  // complete that speculative execution behind a branch is better suited to
-  // handle on modern x86 chips.
-  if (ForceMemOperand) {
+  // of potentially hot loops, we want to quickly remove all CMOVs (ForceAll) or
+  // the ones with a memory operand (ForceMemOperand option). The latter CMOV
+  // will risk a stall waiting for the load to complete that speculative
+  // execution behind a branch is better suited to handle on modern x86 chips.
+  if (ForceMemOperand || ForceAll) {
     CmovGroups AllCmovGroups;
     SmallVector<MachineBasicBlock *, 4> Blocks;
     for (auto &MBB : MF)
       Blocks.push_back(&MBB);
     if (collectCmovCandidates(Blocks, AllCmovGroups, /*IncludeLoads*/ true)) {
       for (auto &Group : AllCmovGroups) {
         // Skip any group that doesn't do at least one memory operand cmov.
-        if (llvm::none_of(Group, [&](MachineInstr *I) { return I->mayLoad(); }))
+        if (ForceMemOperand && !ForceAll &&
+            llvm::none_of(Group, [&](MachineInstr *I) { return I->mayLoad(); }))
           continue;

         // For CMOV groups which we can rewrite and which contain a memory load,
         // always rewrite them. On x86, a CMOV will dramatically amplify any
         // memory latency by blocking speculative execution.
         Changed = true;
         convertCmovInstsToBranches(Group);
       }
     }
+    // Early return as ForceAll converts all CmovGroups.
+    if (ForceAll)
+      return Changed;
   }

   //===--------------------------------------------------------------------===//
   // Register-operand Conversion Algorithm
   // ---------
   // For each inner most loop
   // collectCmovCandidates() {
   // Find all CMOV-group-candidates.
@@ 660 lines not shown @@

llvm/test/CodeGen/X86/x86-cmov-converter.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=x86_64-pc-linux -x86-cmov-converter=true -verify-machineinstrs -disable-block-placement < %s | FileCheck -allow-deprecated-dag-overlap %s
+; RUN: llc -mtriple=x86_64-pc-linux -x86-cmov-converter=true -x86-cmov-converter-force-all=true -verify-machineinstrs -disable-block-placement < %s | FileCheck -allow-deprecated-dag-overlap %s -check-prefix=CHECK-FORCEALL

 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; This test checks that x86-cmov-converter optimization transform CMOV
 ;; instruction into branches when it is profitable.
 ;; There are 5 cases below:
 ;; 1. CmovInCriticalPath:
 ;; CMOV depends on the condition and it is in the hot path.
 ;; Thus, it worths transforming.
@@ 114 lines not shown @@
 ; CHECK-NEXT: # in Loop: Header=BB0_2 Depth=1
 ; CHECK-NEXT: imull %r9d, %r10d
 ; CHECK-NEXT: movl %r10d, (%rcx,%rdi,4)
 ; CHECK-NEXT: addq $1, %rdi
 ; CHECK-NEXT: cmpq %rdi, %r8
 ; CHECK-NEXT: jne .LBB0_2
 ; CHECK-NEXT: .LBB0_5: # %for.cond.cleanup
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: CmovInHotPath:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: testl %edi, %edi
+; CHECK-FORCEALL-NEXT: jle .LBB0_5
+; CHECK-FORCEALL-NEXT: # %bb.1: # %for.body.preheader
+; CHECK-FORCEALL-NEXT: movl %edi, %r8d
+; CHECK-FORCEALL-NEXT: xorl %edi, %edi
+; CHECK-FORCEALL-NEXT: .LBB0_2: # %for.body
+; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-FORCEALL-NEXT: movl (%rcx,%rdi,4), %eax
+; CHECK-FORCEALL-NEXT: leal 1(%rax), %r9d
+; CHECK-FORCEALL-NEXT: imull %esi, %eax
+; CHECK-FORCEALL-NEXT: movl $10, %r10d
+; CHECK-FORCEALL-NEXT: cmpl %edx, %eax
+; CHECK-FORCEALL-NEXT: jg .LBB0_4
+; CHECK-FORCEALL-NEXT: # %bb.3: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB0_2 Depth=1
+; CHECK-FORCEALL-NEXT: movl %r9d, %r10d
+; CHECK-FORCEALL-NEXT: .LBB0_4: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB0_2 Depth=1
+; CHECK-FORCEALL-NEXT: imull %r9d, %r10d
+; CHECK-FORCEALL-NEXT: movl %r10d, (%rcx,%rdi,4)
+; CHECK-FORCEALL-NEXT: addq $1, %rdi
+; CHECK-FORCEALL-NEXT: cmpq %rdi, %r8
+; CHECK-FORCEALL-NEXT: jne .LBB0_2
+; CHECK-FORCEALL-NEXT: .LBB0_5: # %for.cond.cleanup
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %cmp14 = icmp sgt i32 %n, 0
 br i1 %cmp14, label %for.body.preheader, label %for.cond.cleanup

 for.body.preheader: ; preds = %entry
 %wide.trip.count = zext i32 %n to i64
 br label %for.body

@@ 37 lines not shown @@
 ; CHECK-NEXT: cltd
 ; CHECK-NEXT: idivl %r9d
 ; CHECK-NEXT: movl %eax, (%r8,%rdi,4)
 ; CHECK-NEXT: addq $1, %rdi
 ; CHECK-NEXT: cmpq %rdi, %r10
 ; CHECK-NEXT: jne .LBB1_2
 ; CHECK-NEXT: .LBB1_3: # %for.cond.cleanup
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: CmovNotInHotPath:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: testl %edi, %edi
+; CHECK-FORCEALL-NEXT: jle .LBB1_5
+; CHECK-FORCEALL-NEXT: # %bb.1: # %for.body.preheader
+; CHECK-FORCEALL-NEXT: movl %edx, %r9d
+; CHECK-FORCEALL-NEXT: movl %edi, %r10d
+; CHECK-FORCEALL-NEXT: xorl %edi, %edi
+; CHECK-FORCEALL-NEXT: .LBB1_2: # %for.body
+; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-FORCEALL-NEXT: movl (%rcx,%rdi,4), %r11d
+; CHECK-FORCEALL-NEXT: movl %r11d, %eax
+; CHECK-FORCEALL-NEXT: imull %esi, %eax
+; CHECK-FORCEALL-NEXT: movl $10, %edx
+; CHECK-FORCEALL-NEXT: cmpl %r9d, %eax
+; CHECK-FORCEALL-NEXT: jg .LBB1_4
+; CHECK-FORCEALL-NEXT: # %bb.3: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB1_2 Depth=1
+; CHECK-FORCEALL-NEXT: movl %r11d, %edx
+; CHECK-FORCEALL-NEXT: .LBB1_4: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB1_2 Depth=1
+; CHECK-FORCEALL-NEXT: movl %edx, (%rcx,%rdi,4)
+; CHECK-FORCEALL-NEXT: movl (%r8,%rdi,4), %eax
+; CHECK-FORCEALL-NEXT: cltd
+; CHECK-FORCEALL-NEXT: idivl %r9d
+; CHECK-FORCEALL-NEXT: movl %eax, (%r8,%rdi,4)
+; CHECK-FORCEALL-NEXT: addq $1, %rdi
+; CHECK-FORCEALL-NEXT: cmpq %rdi, %r10
+; CHECK-FORCEALL-NEXT: jne .LBB1_2
+; CHECK-FORCEALL-NEXT: .LBB1_5: # %for.cond.cleanup
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %cmp18 = icmp sgt i32 %n, 0
 br i1 %cmp18, label %for.body.preheader, label %for.cond.cleanup

 for.body.preheader: ; preds = %entry
 %wide.trip.count = zext i32 %n to i64
 br label %for.body

@@ 40 lines not shown @@
 ; CHECK-NEXT: .LBB2_4: # %for.body
 ; CHECK-NEXT: # in Loop: Header=BB2_2 Depth=1
 ; CHECK-NEXT: addq $1, %rdx
 ; CHECK-NEXT: movl %eax, %edi
 ; CHECK-NEXT: cmpq %rdx, %r8
 ; CHECK-NEXT: jne .LBB2_2
 ; CHECK-NEXT: .LBB2_5: # %for.cond.cleanup
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: MaxIndex:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: xorl %eax, %eax
+; CHECK-FORCEALL-NEXT: cmpl $2, %edi
+; CHECK-FORCEALL-NEXT: jl .LBB2_5
+; CHECK-FORCEALL-NEXT: # %bb.1: # %for.body.preheader
+; CHECK-FORCEALL-NEXT: movl %edi, %r8d
+; CHECK-FORCEALL-NEXT: xorl %edi, %edi
+; CHECK-FORCEALL-NEXT: movl $1, %edx
+; CHECK-FORCEALL-NEXT: .LBB2_2: # %for.body
+; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-FORCEALL-NEXT: movl (%rsi,%rdx,4), %r9d
+; CHECK-FORCEALL-NEXT: movslq %edi, %rcx
+; CHECK-FORCEALL-NEXT: movl %edx, %eax
+; CHECK-FORCEALL-NEXT: cmpl (%rsi,%rcx,4), %r9d
+; CHECK-FORCEALL-NEXT: jg .LBB2_4
+; CHECK-FORCEALL-NEXT: # %bb.3: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB2_2 Depth=1
+; CHECK-FORCEALL-NEXT: movl %edi, %eax
+; CHECK-FORCEALL-NEXT: .LBB2_4: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB2_2 Depth=1
+; CHECK-FORCEALL-NEXT: addq $1, %rdx
+; CHECK-FORCEALL-NEXT: movl %eax, %edi
+; CHECK-FORCEALL-NEXT: cmpq %rdx, %r8
+; CHECK-FORCEALL-NEXT: jne .LBB2_2
+; CHECK-FORCEALL-NEXT: .LBB2_5: # %for.cond.cleanup
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %cmp14 = icmp sgt i32 %n, 1
 br i1 %cmp14, label %for.body.preheader, label %for.cond.cleanup

 for.body.preheader: ; preds = %entry
 %wide.trip.count = zext i32 %n to i64
 br label %for.body

@@ 41 lines not shown @@
 ; CHECK-NEXT: .LBB3_4: # %for.body
 ; CHECK-NEXT: # in Loop: Header=BB3_2 Depth=1
 ; CHECK-NEXT: addq $1, %rdx
 ; CHECK-NEXT: movl %eax, %edi
 ; CHECK-NEXT: cmpq %rdx, %r8
 ; CHECK-NEXT: jne .LBB3_2
 ; CHECK-NEXT: .LBB3_5: # %for.cond.cleanup
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: MaxIndex_unpredictable:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: xorl %eax, %eax
+; CHECK-FORCEALL-NEXT: cmpl $2, %edi
+; CHECK-FORCEALL-NEXT: jl .LBB3_5
+; CHECK-FORCEALL-NEXT: # %bb.1: # %for.body.preheader
+; CHECK-FORCEALL-NEXT: movl %edi, %r8d
+; CHECK-FORCEALL-NEXT: xorl %edi, %edi
+; CHECK-FORCEALL-NEXT: movl $1, %edx
+; CHECK-FORCEALL-NEXT: .LBB3_2: # %for.body
+; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-FORCEALL-NEXT: movl (%rsi,%rdx,4), %r9d
+; CHECK-FORCEALL-NEXT: movslq %edi, %rcx
+; CHECK-FORCEALL-NEXT: movl %edx, %eax
+; CHECK-FORCEALL-NEXT: cmpl (%rsi,%rcx,4), %r9d
+; CHECK-FORCEALL-NEXT: jg .LBB3_4
+; CHECK-FORCEALL-NEXT: # %bb.3: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB3_2 Depth=1
+; CHECK-FORCEALL-NEXT: movl %edi, %eax
+; CHECK-FORCEALL-NEXT: .LBB3_4: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB3_2 Depth=1
+; CHECK-FORCEALL-NEXT: addq $1, %rdx
+; CHECK-FORCEALL-NEXT: movl %eax, %edi
+; CHECK-FORCEALL-NEXT: cmpq %rdx, %r8
+; CHECK-FORCEALL-NEXT: jne .LBB3_2
+; CHECK-FORCEALL-NEXT: .LBB3_5: # %for.cond.cleanup
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %cmp14 = icmp sgt i32 %n, 1
 br i1 %cmp14, label %for.body.preheader, label %for.cond.cleanup

 for.body.preheader: ; preds = %entry
 %wide.trip.count = zext i32 %n to i64
 br label %for.body

@@ 31 lines not shown @@
 ; CHECK-NEXT: movl (%rsi,%rdx,4), %edi
 ; CHECK-NEXT: cmpl %eax, %edi
 ; CHECK-NEXT: cmovgl %edi, %eax
 ; CHECK-NEXT: addq $1, %rdx
 ; CHECK-NEXT: cmpq %rdx, %rcx
 ; CHECK-NEXT: jne .LBB4_2
 ; CHECK-NEXT: .LBB4_3: # %for.cond.cleanup
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: MaxValue:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: movl (%rsi), %ecx
+; CHECK-FORCEALL-NEXT: cmpl $2, %edi
+; CHECK-FORCEALL-NEXT: jge .LBB4_3
+; CHECK-FORCEALL-NEXT: # %bb.1:
+; CHECK-FORCEALL-NEXT: movl %ecx, %eax
+; CHECK-FORCEALL-NEXT: .LBB4_2: # %for.cond.cleanup
+; CHECK-FORCEALL-NEXT: retq
+; CHECK-FORCEALL-NEXT: .LBB4_3: # %for.body.preheader
+; CHECK-FORCEALL-NEXT: movl %edi, %edi
+; CHECK-FORCEALL-NEXT: movl $1, %edx
+; CHECK-FORCEALL-NEXT: .LBB4_4: # %for.body
+; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-FORCEALL-NEXT: movl (%rsi,%rdx,4), %eax
+; CHECK-FORCEALL-NEXT: cmpl %ecx, %eax
+; CHECK-FORCEALL-NEXT: jg .LBB4_6
+; CHECK-FORCEALL-NEXT: # %bb.5: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB4_4 Depth=1
+; CHECK-FORCEALL-NEXT: movl %ecx, %eax
+; CHECK-FORCEALL-NEXT: .LBB4_6: # %for.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB4_4 Depth=1
+; CHECK-FORCEALL-NEXT: addq $1, %rdx
+; CHECK-FORCEALL-NEXT: movl %eax, %ecx
+; CHECK-FORCEALL-NEXT: cmpq %rdx, %rdi
+; CHECK-FORCEALL-NEXT: je .LBB4_2
+; CHECK-FORCEALL-NEXT: jmp .LBB4_4
 entry:
 %0 = load i32, i32* %a, align 4
 %cmp13 = icmp sgt i32 %n, 1
 br i1 %cmp13, label %for.body.preheader, label %for.cond.cleanup

 for.body.preheader: ; preds = %entry
 %wide.trip.count = zext i32 %n to i64
 br label %for.body
@@ 28 lines not shown @@
 ; CHECK-NEXT: movq 8(%rdx,%rcx,8), %rdx
 ; CHECK-NEXT: .LBB5_2: # %while.body
 ; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT: movl (%rdx), %ecx
 ; CHECK-NEXT: cmpl %ecx, %eax
 ; CHECK-NEXT: ja .LBB5_1
 ; CHECK-NEXT: # %bb.3: # %while.end
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: BinarySearch:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: movl (%rsi), %eax
+; CHECK-FORCEALL-NEXT: jmp .LBB5_2
+; CHECK-FORCEALL-NEXT: .LBB5_1: # %while.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB5_2 Depth=1
+; CHECK-FORCEALL-NEXT: movl %ecx, %eax
+; CHECK-FORCEALL-NEXT: xorl %ecx, %ecx
+; CHECK-FORCEALL-NEXT: btl %eax, %edi
+; CHECK-FORCEALL-NEXT: setae %cl
+; CHECK-FORCEALL-NEXT: movq 8(%rdx,%rcx,8), %rdx
+; CHECK-FORCEALL-NEXT: .LBB5_2: # %while.body
+; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-FORCEALL-NEXT: movl (%rdx), %ecx
+; CHECK-FORCEALL-NEXT: cmpl %ecx, %eax
+; CHECK-FORCEALL-NEXT: ja .LBB5_1
+; CHECK-FORCEALL-NEXT: # %bb.3: # %while.end
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %Val8 = getelementptr inbounds %struct.Node, %struct.Node* %Curr, i64 0, i32 0
 %0 = load i32, i32* %Val8, align 8
 %Val19 = getelementptr inbounds %struct.Node, %struct.Node* %Next, i64 0, i32 0
 %1 = load i32, i32* %Val19, align 8
 %cmp10 = icmp ugt i32 %0, %1
 br i1 %cmp10, label %while.body, label %while.end

@@ 74 lines not shown @@
 ; CHECK-NEXT: xorl %edx, %edx
 ; CHECK-NEXT: divl %ecx
 ; CHECK-NEXT: movl %edx, (%rdi,%rsi,4)
 ; CHECK-NEXT: addl $1, %esi
 ; CHECK-NEXT: cmpl %r9d, %esi
 ; CHECK-NEXT: ja .LBB6_2
 ; CHECK-NEXT: .LBB6_5: # %while.end
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: Transform:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: movb $1, %al
+; CHECK-FORCEALL-NEXT: testb %al, %al
+; CHECK-FORCEALL-NEXT: jne .LBB6_5
+; CHECK-FORCEALL-NEXT: # %bb.1: # %while.body.preheader
+; CHECK-FORCEALL-NEXT: movl %edx, %r8d
+; CHECK-FORCEALL-NEXT: xorl %esi, %esi
+; CHECK-FORCEALL-NEXT: .LBB6_2: # %while.body
+; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-FORCEALL-NEXT: movslq %esi, %rsi
+; CHECK-FORCEALL-NEXT: movl (%rdi,%rsi,4), %eax
+; CHECK-FORCEALL-NEXT: xorl %edx, %edx
+; CHECK-FORCEALL-NEXT: divl %r8d
+; CHECK-FORCEALL-NEXT: movl %eax, %edx
+; CHECK-FORCEALL-NEXT: movl $11, %eax
+; CHECK-FORCEALL-NEXT: movl %r8d, %ecx
+; CHECK-FORCEALL-NEXT: cmpl %r8d, %edx
+; CHECK-FORCEALL-NEXT: ja .LBB6_4
+; CHECK-FORCEALL-NEXT: # %bb.3: # %while.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB6_2 Depth=1
+; CHECK-FORCEALL-NEXT: movl $22, %eax
+; CHECK-FORCEALL-NEXT: movl $22, %ecx
+; CHECK-FORCEALL-NEXT: .LBB6_4: # %while.body
+; CHECK-FORCEALL-NEXT: # in Loop: Header=BB6_2 Depth=1
+; CHECK-FORCEALL-NEXT: xorl %edx, %edx
+; CHECK-FORCEALL-NEXT: divl %ecx
+; CHECK-FORCEALL-NEXT: movl %edx, (%rdi,%rsi,4)
+; CHECK-FORCEALL-NEXT: addl $1, %esi
+; CHECK-FORCEALL-NEXT: cmpl %r9d, %esi
+; CHECK-FORCEALL-NEXT: ja .LBB6_2
+; CHECK-FORCEALL-NEXT: .LBB6_5: # %while.end
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %cmp10 = icmp ugt i32 0, %n
 br i1 %cmp10, label %while.body, label %while.end

 while.body: ; preds = %entry, %while.body
 %i = phi i32 [ %i_inc, %while.body ], [ 0, %entry ]
 %arr_i = getelementptr inbounds i32, i32* %arr, i32 %i
 %x = load i32, i32* %arr_i, align 4
@@ 19 lines not shown @@
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: movl %edx, %eax
 ; CHECK-NEXT: cmpl %esi, %edi
 ; CHECK-NEXT: ja .LBB7_2
 ; CHECK-NEXT: # %bb.1: # %entry
 ; CHECK-NEXT: movl (%rcx), %eax
 ; CHECK-NEXT: .LBB7_2: # %entry
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: test_cmov_memoperand:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: movl %edx, %eax
+; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
+; CHECK-FORCEALL-NEXT: ja .LBB7_2
+; CHECK-FORCEALL-NEXT: # %bb.1: # %entry
+; CHECK-FORCEALL-NEXT: movl (%rcx), %eax
+; CHECK-FORCEALL-NEXT: .LBB7_2: # %entry
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %cond = icmp ugt i32 %a, %b
 %load = load i32, i32* %y
 %z = select i1 %cond, i32 %x, i32 %load
 ret i32 %z
 }

 ; TODO: If cmov instruction is marked as unpredicatable, do not convert it to branch.
 define i32 @test_cmov_memoperand_unpredictable(i32 %a, i32 %b, i32 %x, i32* %y) #0 {
 ; CHECK-LABEL: test_cmov_memoperand_unpredictable:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: movl %edx, %eax
 ; CHECK-NEXT: cmpl %esi, %edi
 ; CHECK-NEXT: ja .LBB8_2
 ; CHECK-NEXT: # %bb.1: # %entry
 ; CHECK-NEXT: movl (%rcx), %eax
 ; CHECK-NEXT: .LBB8_2: # %entry
 ; CHECK-NEXT: retq
+; CHECK-FORCEALL-LABEL: test_cmov_memoperand_unpredictable:
+; CHECK-FORCEALL: # %bb.0: # %entry
+; CHECK-FORCEALL-NEXT: movl %edx, %eax
+; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
+; CHECK-FORCEALL-NEXT: ja .LBB8_2
+; CHECK-FORCEALL-NEXT: # %bb.1: # %entry
+; CHECK-FORCEALL-NEXT: movl (%rcx), %eax
+; CHECK-FORCEALL-NEXT: .LBB8_2: # %entry
+; CHECK-FORCEALL-NEXT: retq
 entry:
 %cond = icmp ugt i32 %a, %b
 %load = load i32, i32* %y
 %z = select i1 %cond, i32 %x, i32 %load, !unpredictable !0
 ret i32 %z
 }

 ; Test that we can convert a group of cmovs where only one has a memory
 ; operand.
 define i32 @test_cmov_memoperand_in_group(i32 %a, i32 %b, i32 %x, i32* %y.ptr) #0 {
 ; CHECK-LABEL: test_cmov_memoperand_in_group:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: movl %edx, %eax
 ; CHECK-NEXT: movl %edx, %r8d
 ; CHECK-NEXT: cmpl %esi, %edi
 ; CHECK-NEXT: ja .LBB9_2
 ; CHECK-NEXT: # %bb.1: # %entry
 ; CHECK-NEXT: movl (%rcx), %r8d
 ; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: movl %esi, %edx
 ; CHECK-NEXT: .LBB9_2: # %entry
 ; CHECK-NEXT: addl %r8d, %eax
 ; CHECK-NEXT: addl %edx, %eax
 ; CHECK-NEXT: retq
				; CHECK-FORCEALL-LABEL: test_cmov_memoperand_in_group:
				; CHECK-FORCEALL: # %bb.0: # %entry
				; CHECK-FORCEALL-NEXT: movl %edx, %eax
				; CHECK-FORCEALL-NEXT: movl %edx, %r8d
				; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
				; CHECK-FORCEALL-NEXT: ja .LBB9_2
				; CHECK-FORCEALL-NEXT: # %bb.1: # %entry
				; CHECK-FORCEALL-NEXT: movl (%rcx), %r8d
				; CHECK-FORCEALL-NEXT: movl %edi, %eax
				; CHECK-FORCEALL-NEXT: movl %esi, %edx
				; CHECK-FORCEALL-NEXT: .LBB9_2: # %entry
				; CHECK-FORCEALL-NEXT: addl %r8d, %eax
				; CHECK-FORCEALL-NEXT: addl %edx, %eax
				; CHECK-FORCEALL-NEXT: retq
	entry:			entry:
	%cond = icmp ugt i32 %a, %b			%cond = icmp ugt i32 %a, %b
	%y = load i32, i32* %y.ptr			%y = load i32, i32* %y.ptr
	%z1 = select i1 %cond, i32 %x, i32 %a			%z1 = select i1 %cond, i32 %x, i32 %a
	%z2 = select i1 %cond, i32 %x, i32 %y			%z2 = select i1 %cond, i32 %x, i32 %y
	%z3 = select i1 %cond, i32 %x, i32 %b			%z3 = select i1 %cond, i32 %x, i32 %b
	%s1 = add i32 %z1, %z2			%s1 = add i32 %z1, %z2
	%s2 = add i32 %s1, %z3			%s2 = add i32 %s1, %z3
	(11 lines not shown)
	; CHECK-NEXT: # %bb.1: # %entry			; CHECK-NEXT: # %bb.1: # %entry
	; CHECK-NEXT: movl (%rcx), %r8d			; CHECK-NEXT: movl (%rcx), %r8d
	; CHECK-NEXT: movl %edi, %eax			; CHECK-NEXT: movl %edi, %eax
	; CHECK-NEXT: movl %esi, %edx			; CHECK-NEXT: movl %esi, %edx
	; CHECK-NEXT: .LBB10_2: # %entry			; CHECK-NEXT: .LBB10_2: # %entry
	; CHECK-NEXT: addl %r8d, %eax			; CHECK-NEXT: addl %r8d, %eax
	; CHECK-NEXT: addl %edx, %eax			; CHECK-NEXT: addl %edx, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
				; CHECK-FORCEALL-LABEL: test_cmov_memoperand_in_group2:
				; CHECK-FORCEALL: # %bb.0: # %entry
				; CHECK-FORCEALL-NEXT: movl %edx, %eax
				; CHECK-FORCEALL-NEXT: movl %edx, %r8d
				; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
				; CHECK-FORCEALL-NEXT: jbe .LBB10_2
				; CHECK-FORCEALL-NEXT: # %bb.1: # %entry
				; CHECK-FORCEALL-NEXT: movl (%rcx), %r8d
				; CHECK-FORCEALL-NEXT: movl %edi, %eax
				; CHECK-FORCEALL-NEXT: movl %esi, %edx
				; CHECK-FORCEALL-NEXT: .LBB10_2: # %entry
				; CHECK-FORCEALL-NEXT: addl %r8d, %eax
				; CHECK-FORCEALL-NEXT: addl %edx, %eax
				; CHECK-FORCEALL-NEXT: retq
	entry:			entry:
	%cond = icmp ugt i32 %a, %b			%cond = icmp ugt i32 %a, %b
	%y = load i32, i32* %y.ptr			%y = load i32, i32* %y.ptr
	%z2 = select i1 %cond, i32 %a, i32 %x			%z2 = select i1 %cond, i32 %a, i32 %x
	%z1 = select i1 %cond, i32 %y, i32 %x			%z1 = select i1 %cond, i32 %y, i32 %x
	%z3 = select i1 %cond, i32 %b, i32 %x			%z3 = select i1 %cond, i32 %b, i32 %x
	%s1 = add i32 %z1, %z2			%s1 = add i32 %z1, %z2
	%s2 = add i32 %s1, %z3			%s2 = add i32 %s1, %z3
	ret i32 %s2			ret i32 %s2
	}			}

	; Test that we don't convert a group of cmovs with conflicting directions of			; Test that we don't convert a group of cmovs with conflicting directions of
	; loads.			; loads.
	define i32 @test_cmov_memoperand_conflicting_dir(i32 %a, i32 %b, i32 %x, i32* %y1.ptr, i32* %y2.ptr) #0 {			define i32 @test_cmov_memoperand_conflicting_dir(i32 %a, i32 %b, i32 %x, i32* %y1.ptr, i32* %y2.ptr) #0 {
	; CHECK-LABEL: test_cmov_memoperand_conflicting_dir:			; CHECK-LABEL: test_cmov_memoperand_conflicting_dir:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: cmpl %esi, %edi			; CHECK-NEXT: cmpl %esi, %edi
	; CHECK-NEXT: movl (%rcx), %eax			; CHECK-NEXT: movl (%rcx), %eax
	; CHECK-NEXT: cmoval %edx, %eax			; CHECK-NEXT: cmoval %edx, %eax
	; CHECK-NEXT: cmoval (%r8), %edx			; CHECK-NEXT: cmoval (%r8), %edx
	; CHECK-NEXT: addl %edx, %eax			; CHECK-NEXT: addl %edx, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
				; CHECK-FORCEALL-LABEL: test_cmov_memoperand_conflicting_dir:
				; CHECK-FORCEALL: # %bb.0: # %entry
				; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
				; CHECK-FORCEALL-NEXT: movl (%rcx), %eax
				; CHECK-FORCEALL-NEXT: cmoval %edx, %eax
				; CHECK-FORCEALL-NEXT: cmoval (%r8), %edx
				; CHECK-FORCEALL-NEXT: addl %edx, %eax
				; CHECK-FORCEALL-NEXT: retq
	entry:			entry:
	%cond = icmp ugt i32 %a, %b			%cond = icmp ugt i32 %a, %b
	%y1 = load i32, i32* %y1.ptr			%y1 = load i32, i32* %y1.ptr
	%y2 = load i32, i32* %y2.ptr			%y2 = load i32, i32* %y2.ptr
	%z1 = select i1 %cond, i32 %x, i32 %y1			%z1 = select i1 %cond, i32 %x, i32 %y1
	%z2 = select i1 %cond, i32 %y2, i32 %x			%z2 = select i1 %cond, i32 %y2, i32 %x
	%s1 = add i32 %z1, %z2			%s1 = add i32 %z1, %z2
	ret i32 %s1			ret i32 %s1
	}			}

	; Test that we can convert a group of cmovs where only one has a memory			; Test that we can convert a group of cmovs where only one has a memory
	; operand and where that memory operand's registers come from a prior cmov in			; operand and where that memory operand's registers come from a prior cmov in
	; the group.			; the group.
	define i32 @test_cmov_memoperand_in_group_reuse_for_addr(i32 %a, i32 %b, i32* %x, i32* %y) #0 {			define i32 @test_cmov_memoperand_in_group_reuse_for_addr(i32 %a, i32 %b, i32* %x, i32* %y) #0 {
	; CHECK-LABEL: test_cmov_memoperand_in_group_reuse_for_addr:			; CHECK-LABEL: test_cmov_memoperand_in_group_reuse_for_addr:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movl %edi, %eax			; CHECK-NEXT: movl %edi, %eax
	; CHECK-NEXT: cmpl %esi, %edi			; CHECK-NEXT: cmpl %esi, %edi
	; CHECK-NEXT: ja .LBB12_2			; CHECK-NEXT: ja .LBB12_2
	; CHECK-NEXT: # %bb.1: # %entry			; CHECK-NEXT: # %bb.1: # %entry
	; CHECK-NEXT: movl (%rcx), %eax			; CHECK-NEXT: movl (%rcx), %eax
	; CHECK-NEXT: .LBB12_2: # %entry			; CHECK-NEXT: .LBB12_2: # %entry
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
				; CHECK-FORCEALL-LABEL: test_cmov_memoperand_in_group_reuse_for_addr:
				; CHECK-FORCEALL: # %bb.0: # %entry
				; CHECK-FORCEALL-NEXT: movl %edi, %eax
				; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
				; CHECK-FORCEALL-NEXT: ja .LBB12_2
				; CHECK-FORCEALL-NEXT: # %bb.1: # %entry
				; CHECK-FORCEALL-NEXT: movl (%rcx), %eax
				; CHECK-FORCEALL-NEXT: .LBB12_2: # %entry
				; CHECK-FORCEALL-NEXT: retq
	entry:			entry:
	%cond = icmp ugt i32 %a, %b			%cond = icmp ugt i32 %a, %b
	%p = select i1 %cond, i32* %x, i32* %y			%p = select i1 %cond, i32* %x, i32* %y
	%load = load i32, i32* %p			%load = load i32, i32* %p
	%z = select i1 %cond, i32 %a, i32 %load			%z = select i1 %cond, i32 %a, i32 %load
	ret i32 %z			ret i32 %z
	}			}

	; Test that we can convert a group of two cmovs with memory operands where one			; Test that we can convert a group of two cmovs with memory operands where one
	; uses the result of the other as part of the address.			; uses the result of the other as part of the address.
	define i32 @test_cmov_memoperand_in_group_reuse_for_addr2(i32 %a, i32 %b, i32* %x, i32** %y) #0 {			define i32 @test_cmov_memoperand_in_group_reuse_for_addr2(i32 %a, i32 %b, i32* %x, i32** %y) #0 {
	; CHECK-LABEL: test_cmov_memoperand_in_group_reuse_for_addr2:			; CHECK-LABEL: test_cmov_memoperand_in_group_reuse_for_addr2:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movl %edi, %eax			; CHECK-NEXT: movl %edi, %eax
	; CHECK-NEXT: cmpl %esi, %edi			; CHECK-NEXT: cmpl %esi, %edi
	; CHECK-NEXT: ja .LBB13_2			; CHECK-NEXT: ja .LBB13_2
	; CHECK-NEXT: # %bb.1: # %entry			; CHECK-NEXT: # %bb.1: # %entry
	; CHECK-NEXT: movq (%rcx), %rax			; CHECK-NEXT: movq (%rcx), %rax
	; CHECK-NEXT: movl (%rax), %eax			; CHECK-NEXT: movl (%rax), %eax
	; CHECK-NEXT: .LBB13_2: # %entry			; CHECK-NEXT: .LBB13_2: # %entry
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
				; CHECK-FORCEALL-LABEL: test_cmov_memoperand_in_group_reuse_for_addr2:
				; CHECK-FORCEALL: # %bb.0: # %entry
				; CHECK-FORCEALL-NEXT: movl %edi, %eax
				; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
				; CHECK-FORCEALL-NEXT: ja .LBB13_2
				; CHECK-FORCEALL-NEXT: # %bb.1: # %entry
				; CHECK-FORCEALL-NEXT: movq (%rcx), %rax
				; CHECK-FORCEALL-NEXT: movl (%rax), %eax
				; CHECK-FORCEALL-NEXT: .LBB13_2: # %entry
				; CHECK-FORCEALL-NEXT: retq
	entry:			entry:
	%cond = icmp ugt i32 %a, %b			%cond = icmp ugt i32 %a, %b
	%load1 = load i32*, i32** %y			%load1 = load i32*, i32** %y
	%p = select i1 %cond, i32* %x, i32* %load1			%p = select i1 %cond, i32* %x, i32* %load1
	%load2 = load i32, i32* %p			%load2 = load i32, i32* %p
	%z = select i1 %cond, i32 %a, i32 %load2			%z = select i1 %cond, i32 %a, i32 %load2
	ret i32 %z			ret i32 %z
	}			}

	; Test that we can convert a group of cmovs where only one has a memory			; Test that we can convert a group of cmovs where only one has a memory
	; operand and where that memory operand's registers come from a prior cmov and			; operand and where that memory operand's registers come from a prior cmov and
	; where that cmov gets its input from a prior cmov in the group.			; where that cmov gets its input from a prior cmov in the group.
	define i32 @test_cmov_memoperand_in_group_reuse_for_addr3(i32 %a, i32 %b, i32* %x, i32* %y, i32* %z) #0 {			define i32 @test_cmov_memoperand_in_group_reuse_for_addr3(i32 %a, i32 %b, i32* %x, i32* %y, i32* %z) #0 {
	; CHECK-LABEL: test_cmov_memoperand_in_group_reuse_for_addr3:			; CHECK-LABEL: test_cmov_memoperand_in_group_reuse_for_addr3:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movl %edi, %eax			; CHECK-NEXT: movl %edi, %eax
	; CHECK-NEXT: cmpl %esi, %edi			; CHECK-NEXT: cmpl %esi, %edi
	; CHECK-NEXT: ja .LBB14_2			; CHECK-NEXT: ja .LBB14_2
	; CHECK-NEXT: # %bb.1: # %entry			; CHECK-NEXT: # %bb.1: # %entry
	; CHECK-NEXT: movl (%rcx), %eax			; CHECK-NEXT: movl (%rcx), %eax
	; CHECK-NEXT: .LBB14_2: # %entry			; CHECK-NEXT: .LBB14_2: # %entry
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
				; CHECK-FORCEALL-LABEL: test_cmov_memoperand_in_group_reuse_for_addr3:
				; CHECK-FORCEALL: # %bb.0: # %entry
				; CHECK-FORCEALL-NEXT: movl %edi, %eax
				; CHECK-FORCEALL-NEXT: cmpl %esi, %edi
				; CHECK-FORCEALL-NEXT: ja .LBB14_2
				; CHECK-FORCEALL-NEXT: # %bb.1: # %entry
				; CHECK-FORCEALL-NEXT: movl (%rcx), %eax
				; CHECK-FORCEALL-NEXT: .LBB14_2: # %entry
				; CHECK-FORCEALL-NEXT: retq
	entry:			entry:
	%cond = icmp ugt i32 %a, %b			%cond = icmp ugt i32 %a, %b
	%p = select i1 %cond, i32* %x, i32* %y			%p = select i1 %cond, i32* %x, i32* %y
	%p2 = select i1 %cond, i32* %z, i32* %p			%p2 = select i1 %cond, i32* %z, i32* %p
	%load = load i32, i32* %p2			%load = load i32, i32* %p2
	%r = select i1 %cond, i32 %a, i32 %load			%r = select i1 %cond, i32 %a, i32 %load
	ret i32 %r			ret i32 %r
	}			}
	(30 lines not shown)
	; CHECK-NEXT: .LBB15_5: # %loop.body			; CHECK-NEXT: .LBB15_5: # %loop.body
	; CHECK-NEXT: # in Loop: Header=BB15_1 Depth=1			; CHECK-NEXT: # in Loop: Header=BB15_1 Depth=1
	; CHECK-NEXT: movl %edi, (%rcx)			; CHECK-NEXT: movl %edi, (%rcx)
	; CHECK-NEXT: addl $1, %esi			; CHECK-NEXT: addl $1, %esi
	; CHECK-NEXT: cmpl $1024, %esi # imm = 0x400			; CHECK-NEXT: cmpl $1024, %esi # imm = 0x400
	; CHECK-NEXT: jl .LBB15_1			; CHECK-NEXT: jl .LBB15_1
	; CHECK-NEXT: # %bb.6: # %exit			; CHECK-NEXT: # %bb.6: # %exit
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
				; CHECK-FORCEALL-LABEL: test_memoperand_loop:
				; CHECK-FORCEALL: # %bb.0: # %entry
				; CHECK-FORCEALL-NEXT: movq begin@GOTPCREL(%rip), %r8
				; CHECK-FORCEALL-NEXT: movq (%r8), %rax
				; CHECK-FORCEALL-NEXT: movq end@GOTPCREL(%rip), %rcx
				; CHECK-FORCEALL-NEXT: movq (%rcx), %rdx
				; CHECK-FORCEALL-NEXT: xorl %esi, %esi
				; CHECK-FORCEALL-NEXT: movq %rax, %rcx
				; CHECK-FORCEALL-NEXT: .LBB15_1: # %loop.body
				; CHECK-FORCEALL-NEXT: # =>This Inner Loop Header: Depth=1
				; CHECK-FORCEALL-NEXT: addq $8, %rcx
				; CHECK-FORCEALL-NEXT: cmpq %rdx, %rcx
				; CHECK-FORCEALL-NEXT: ja .LBB15_3
				; CHECK-FORCEALL-NEXT: # %bb.2: # %loop.body
				; CHECK-FORCEALL-NEXT: # in Loop: Header=BB15_1 Depth=1
				; CHECK-FORCEALL-NEXT: movq (%r8), %rcx
				; CHECK-FORCEALL-NEXT: .LBB15_3: # %loop.body
				; CHECK-FORCEALL-NEXT: # in Loop: Header=BB15_1 Depth=1
				; CHECK-FORCEALL-NEXT: movl %edi, (%rcx)
				; CHECK-FORCEALL-NEXT: addq $8, %rcx
				; CHECK-FORCEALL-NEXT: cmpq %rdx, %rcx
				; CHECK-FORCEALL-NEXT: ja .LBB15_5
				; CHECK-FORCEALL-NEXT: # %bb.4: # %loop.body
				; CHECK-FORCEALL-NEXT: # in Loop: Header=BB15_1 Depth=1
				; CHECK-FORCEALL-NEXT: movq %rax, %rcx
				; CHECK-FORCEALL-NEXT: .LBB15_5: # %loop.body
				; CHECK-FORCEALL-NEXT: # in Loop: Header=BB15_1 Depth=1
				; CHECK-FORCEALL-NEXT: movl %edi, (%rcx)
				; CHECK-FORCEALL-NEXT: addl $1, %esi
				; CHECK-FORCEALL-NEXT: cmpl $1024, %esi # imm = 0x400
				; CHECK-FORCEALL-NEXT: jl .LBB15_1
				; CHECK-FORCEALL-NEXT: # %bb.6: # %exit
				; CHECK-FORCEALL-NEXT: retq
	entry:			entry:
	%begin = load i32*, i32** @begin, align 8			%begin = load i32*, i32** @begin, align 8
	%end = load i32*, i32** @end, align 8			%end = load i32*, i32** @end, align 8
	br label %loop.body			br label %loop.body
	loop.body:			loop.body:
	%phi.iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.body ]			%phi.iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.body ]
	%phi.ptr = phi i32* [ %begin, %entry ], [ %dst2, %loop.body ]			%phi.ptr = phi i32* [ %begin, %entry ], [ %dst2, %loop.body ]
	%gep1 = getelementptr inbounds i32, i32 *%phi.ptr, i64 2			%gep1 = getelementptr inbounds i32, i32 *%phi.ptr, i64 2
	(17 lines not shown)