This is an archive of the discontinued LLVM Phabricator instance.

Differential D110867

X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr
Closed · Public

Authored by MatzeB on Sep 30 2021, 11:19 AM.

Details

Reviewers

RKSimon
craig.topper
nikic
manmanren

Commits

rG847a6807332b: X86InstrInfo: Support immediates that are +1/-1 different in…
rGe2c7ee074359: X86InstrInfo: Support immediates that are +1/-1 different in…

Summary

This extends optimizeCompareInstr to re-use previous comparison results if the previous comparison was with an immediate that was 1 bigger or smaller. Example:

CMP x, 13
...
CMP x, 12   ; can be removed if we change the SETg
SETg ...    ; x > 12  changed to `SETge` (x >= 13) removing CMP

Motivation: This often happens because SelectionDAG canonicalization tends to add or subtract 1 when optimizing for fallthrough blocks. For example, for x > C the fallthrough optimization swaps the true/false blocks, turning the condition into !(x > C) --> x <= C, and canonicalization then turns this into x < C + 1.
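The rewrite described above rests on integer identities such as x > C - 1 <=> x >= C, which hold as long as the adjusted immediate does not wrap around. As a minimal sketch, the following C++ checks the identities exhaustively for 8-bit operands. The `Cond` enum and the helpers `adjust`, `eval`, `isUnsigned` and `rewriteIsSound` are hypothetical names chosen for illustration, not LLVM's API; only the mapping itself follows the cases listed in the patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for the X86 condition codes (not LLVM's X86::CondCode).
enum class Cond { L, LE, G, GE, B, BE, A, AE };

// Delta is (immediate of the later CMP) - (immediate of the earlier CMP).
// Returns the condition that gives the same answer when evaluated on the
// earlier CMP's flags, or std::nullopt if there is no such rewrite.
std::optional<Cond> adjust(Cond CC, int Delta) {
  if (Delta == 1) {
    switch (CC) {
    case Cond::L:  return Cond::LE; // x <s  (C + 1) --> x <=s C
    case Cond::B:  return Cond::BE; // x <u  (C + 1) --> x <=u C
    case Cond::GE: return Cond::G;  // x >=s (C + 1) --> x >s  C
    case Cond::AE: return Cond::A;  // x >=u (C + 1) --> x >u  C
    default:       return std::nullopt;
    }
  }
  if (Delta == -1) {
    switch (CC) {
    case Cond::G:  return Cond::GE; // x >s  (C - 1) --> x >=s C
    case Cond::A:  return Cond::AE; // x >u  (C - 1) --> x >=u C
    case Cond::LE: return Cond::L;  // x <=s (C - 1) --> x <s  C
    case Cond::BE: return Cond::B;  // x <=u (C - 1) --> x <u  C
    default:       return std::nullopt;
    }
  }
  return std::nullopt;
}

bool isUnsigned(Cond CC) {
  return CC == Cond::B || CC == Cond::BE || CC == Cond::A || CC == Cond::AE;
}

// Evaluates `X <CC> C` on plain ints; signedness is modeled by the value
// range the caller iterates over.
bool eval(Cond CC, int X, int C) {
  switch (CC) {
  case Cond::L:  case Cond::B:  return X < C;
  case Cond::LE: case Cond::BE: return X <= C;
  case Cond::G:  case Cond::A:  return X > C;
  case Cond::GE: case Cond::AE: return X >= C;
  }
  return false;
}

// Exhaustively checks the rewrite for all 8-bit operands and immediates.
bool rewriteIsSound(Cond CC, int Delta) {
  std::optional<Cond> NewCC = adjust(CC, Delta);
  if (!NewCC)
    return false;
  const int Lo = isUnsigned(CC) ? 0 : -128;
  const int Hi = isUnsigned(CC) ? 255 : 127;
  for (int Earlier = Lo; Earlier <= Hi; ++Earlier) {
    int Later = Earlier + Delta;
    if (Later < Lo || Later > Hi)
      continue; // immediate would wrap; the patch guards against this case
    for (int X = Lo; X <= Hi; ++X)
      if (eval(CC, X, Later) != eval(*NewCC, X, Earlier))
        return false;
  }
  return true;
}
```

For the example in the summary, the earlier compare uses 13 and the later one 12, so Delta is -1 and `adjust(Cond::G, -1)` yields `Cond::GE`, matching the SETg to SETge change.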

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
50 ms	x64 debian > LLVM.Bindings/Go::go.test

Event Timeline

MatzeB created this revision. Sep 30 2021, 11:19 AM

Herald added subscribers: wenlei, pengfei, hiraditya, mcrosier. Sep 30 2021, 11:19 AM

MatzeB requested review of this revision. Sep 30 2021, 11:19 AM

Herald added a project: Restricted Project. Sep 30 2021, 11:19 AM

Herald added a subscriber: llvm-commits.

MatzeB updated this revision to Diff 376297. Sep 30 2021, 11:21 AM

MatzeB added a parent revision: D110865: X86InstrInfo: Optimize more combinations of SUB+CMP.

MatzeB edited the summary of this revision. Sep 30 2021, 11:33 AM

MatzeB mentioned this in D110339: SelectionDAGBuilder: Improve canonicalization by not swapping branch targets. Sep 30 2021, 11:38 AM

foad added a subscriber: foad. Oct 1 2021, 6:20 AM

foad added inline comments.

llvm/lib/Target/X86/X86InstrInfo.cpp
4567	Is it OK to use `INT64_MIN >> Shift` here? I think the standard says right shift of a negative value is implementation-defined.

MatzeB marked an inline comment as done. Oct 1 2021, 10:00 AM

MatzeB added inline comments.

llvm/lib/Target/X86/X86InstrInfo.cpp
4567	Good question! I think we can rely on two's-complement arithmetic shift behavior here. I'm not aware of any compiler behaving differently, certainly not the ones specified for use with LLVM in the documentation: https://llvm.org/docs/GettingStarted.html#host-c-toolchain-both-compiler-and-standard-library Other code such as APInt seems to rely on this as well: https://github.com/llvm/llvm-project/blob/4f0225f6d21b601d62b73dce913bf59d8fb93d87/llvm/include/llvm/ADT/APInt.h#L796 That said, if you have a simple alternative to this, I'd be happy to change the code...

MatzeB marked an inline comment as done. Oct 1 2021, 10:04 AM

MatzeB added inline comments. Oct 1 2021, 10:11 AM

llvm/lib/Target/X86/X86InstrInfo.cpp
4567	How about I add this: static_assert(INT64_MIN >> 16 == INT32_MIN && INT64_MIN >> 24 == INT8_MIN, "expects compiler with twos complement right shift"); and then whoever has a whacky compiler will see the problem and can go fix it.

foad added inline comments. Oct 1 2021, 10:20 AM

llvm/lib/Target/X86/X86InstrInfo.cpp
4567	I was mostly worried about a buildbot with the "undefined behaviour" sanitiser complaining about it, though I'm not actually sure if ubsan detects this by default. I don't really mind what fix you use, if any. The static_assert sounds fine to me. Alternatively you could rewrite the expression, e.g. as `~(INT64_MAX >> Shift)`, but it's much less obvious what it means.
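To make the two alternatives in this thread concrete, here is a minimal sketch that computes the smallest signed value of a narrower integer width both ways. It assumes `Shift` is `64 - BitWidth`; the revision of the diff that used the shift is not shown here, so that relationship and the helper names `signedMinViaShift` and `signedMinViaNot` are assumptions made for illustration. Under that assumption the shift amounts for 32-bit and 8-bit values are 32 and 56.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: Shift == 64 - BitWidth, with 1 <= BitWidth <= 64.

// Relies on right shift of a negative value being an arithmetic shift,
// which is implementation-defined before C++20.
constexpr int64_t signedMinViaShift(unsigned BitWidth) {
  return INT64_MIN >> (64 - BitWidth);
}

// The alternative from the comment above: only a non-negative value is
// shifted, so the result does not depend on how negative values shift.
constexpr int64_t signedMinViaNot(unsigned BitWidth) {
  return ~(INT64_MAX >> (64 - BitWidth));
}

// Compile-time guard in the spirit of the proposed static_assert.
static_assert(signedMinViaShift(32) == INT32_MIN &&
                  signedMinViaShift(8) == INT8_MIN,
              "expects compiler with arithmetic right shift");
static_assert(signedMinViaNot(32) == INT32_MIN &&
                  signedMinViaNot(8) == INT8_MIN,
              "bitwise-not form must agree");
```

Both forms yield the same constants on compilers that shift arithmetically; the second trades readability for independence from that behavior.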

MatzeB updated this revision to Diff 376576. Oct 1 2021, 11:02 AM

ping

Herald added a subscriber: modimo. Oct 13 2021, 5:16 PM

xbolva00 added a subscriber: xbolva00. Oct 13 2021, 5:36 PM

xbolva00 added inline comments.

llvm/test/CodeGen/X86/jump_sign.ll
133 ↗	(On Diff #376576)	cmp now removed

RKSimon added inline comments. Oct 14 2021, 2:51 AM

llvm/lib/Target/X86/X86InstrInfo.cpp
4567	What about using APInt?

MatzeB added inline comments. Oct 14 2021, 10:42 AM

llvm/test/CodeGen/X86/jump_sign.ll
133 ↗	(On Diff #376576)	Good catch. I believe the comment is outdated. Looking at r159888, which added the test (https://github.com/llvm/llvm-project/blob/bb36074047985cb476cadda464447b634972be2a/llvm/test/CodeGen/X86/jump_sign.ll#L110), the intention was that the cmp behind the sub is preserved. The single `CHECK`, however, doesn't really capture this and can match either the `cmp` in bb.0 or the one in bb.1. I believe that SelectionDAG pattern matching has improved in the meantime, so nowadays the cmp is eliminated at that stage. The new version of the code is correct: previously it used a `cmpl ecx, edx`, but `edx` and `eax` contain the same value, and the sub in `bb.0`, the only predecessor of `bb.1`, is just the reverse operation and will produce the reverse flags. So we can indeed remove the cmp when we reverse the user flags (`cmovle -> cmovl`). I accidentally put this test change onto the wrong commit in the stack. The test actually changes in D110862.

MatzeB added inline comments. Oct 14 2021, 11:01 AM

llvm/lib/Target/X86/X86InstrInfo.cpp
4567	"What about using APInt?" Seemed a bit overengineered given the value fits neatly in a uint64_t, but sure, I will change it to use an APInt then.

MatzeB added inline comments. Oct 14 2021, 11:02 AM

llvm/lib/Target/X86/X86InstrInfo.cpp
4567	Or let me factor out a common helper to shift a uint64_t that we can use in APInt and here.

rebase; use APInt to compute maximum/minimum constants.

Harbormaster completed remote builds in B128943: Diff 379823. Oct 14 2021, 2:09 PM

foad added inline comments. Oct 15 2021, 12:24 AM

llvm/lib/Target/X86/X86InstrInfo.cpp
4566	Surely you need `APInt::getSignedMinValue(BitWidth)`?

MatzeB added inline comments. Oct 15 2021, 11:02 AM

llvm/lib/Target/X86/X86InstrInfo.cpp
4566	Good point. Changed them now. I just realized that technically this check isn't necessary, as the tests I added still pass. This is because the `ImmDelta` calculations are actually done on `int64_t` values, so they don't wrap around for 8/16/32-bit immediates and instead just overflow to an unrepresentable immediate. We would have overflow problems for 64-bit immediates, but x86 doesn't support those. I think I'll keep the checks here anyway, as I like the expressed intent, and if someone ever introduces 64-bit immediates we are prepared :)
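The point about 64-bit arithmetic can be illustrated with a small sketch of the delta classification. The function `immDelta` below is a hypothetical helper that mirrors the comparisons added to `isRedundantFlagInstr` in this diff; it assumes both immediates arrive sign-extended to `int64_t`, as the signature in the diff suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper mirroring the ImmDelta classification in the patch.
// ImmValue is the immediate of the compare being removed, OIValue the
// immediate of the earlier flag-producing instruction.
std::optional<int> immDelta(int64_t ImmValue, int64_t OIValue) {
  // Unsigned arithmetic keeps the +1/-1 well defined even at the
  // 64-bit boundaries.
  const uint64_t Imm = static_cast<uint64_t>(ImmValue);
  const uint64_t OI = static_cast<uint64_t>(OIValue);
  if (Imm == OI)
    return 0;
  if (Imm == OI - 1)
    return -1;
  if (Imm == OI + 1)
    return 1;
  return std::nullopt;
}
```

With 8-bit immediates, `immDelta(INT8_MIN, INT8_MAX)` finds no match, because 127 + 1 is 128 in 64-bit arithmetic rather than the sign-extended -128. Only at the 64-bit boundary does the wrap occur: `immDelta(INT64_MIN, INT64_MAX)` reports a delta of 1, which is the case the explicit min/max checks would catch.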

MatzeB updated this revision to Diff 380064. Oct 15 2021, 11:03 AM

MatzeB updated this revision to Diff 380065.

Harbormaster completed remote builds in B129101: Diff 380065. Oct 15 2021, 11:46 AM

LGTM - cheers

This revision is now accepted and ready to land. Oct 16 2021, 2:35 AM

Closed by commit rGe2c7ee074359: X86InstrInfo: Support immediates that are +1/-1 different in… (authored by MatzeB). Oct 28 2021, 10:34 AM

This revision was automatically updated to reflect the committed changes.

MatzeB added a commit: rGe2c7ee074359: X86InstrInfo: Support immediates that are +1/-1 different in….

This seems to have caused a miscompile of Chromium, see https://bugs.chromium.org/p/chromium/issues/detail?id=1265339 (there's even a screenshot).

It appears that in a switch statement, one of the cases gets lost.

See https://bugs.chromium.org/p/chromium/issues/detail?id=1265339#c38 for the attached IR, and how the codegen changes with this patch. Sadly it's not reduced, but at least it's stand-alone.

What we've got is something like:

cmp $21, reg
jle foo

foo:
cmp $20, reg
jle bar

bar:
je baz

I suspect what's happening is that this transformation figures the second cmp is redundant if it changes "jle bar" to "jl bar", but it doesn't take into account that the "je baz" was also depending on that second cmp.
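The suspected failure can be modeled in a few lines. This is a simplified sketch of the scenario described above, not the actual Chromium code: `Flags` keeps only ZF and a single `LT` bit standing in for "SF != OF", and all function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified flags: ZF as on x86, LT standing in for "SF != OF".
struct Flags {
  bool ZF;
  bool LT;
};

// AT&T-style `cmp $Imm, reg` compares reg against Imm.
Flags cmp(int64_t Imm, int64_t Reg) { return {Reg == Imm, Reg < Imm}; }
bool je(Flags F) { return F.ZF; }
bool jl(Flags F) { return F.LT; }
bool jle(Flags F) { return F.ZF || F.LT; }

// Original code: does control reach `baz` for this register value?
bool reachesBazOriginal(int64_t Reg) {
  Flags F = cmp(21, Reg);
  if (!jle(F))
    return false;
  F = cmp(20, Reg);
  if (!jle(F))
    return false;
  return je(F); // true exactly when Reg == 20
}

// Suspected bad rewrite: the second cmp is removed and only `jle bar`
// is updated to `jl bar`; `je baz` still reads the flags of `cmp $21`.
bool reachesBazRewritten(int64_t Reg) {
  Flags F = cmp(21, Reg);
  if (!jle(F))
    return false;
  if (!jl(F)) // Reg < 21, equivalent to the old Reg <= 20
    return false;
  return je(F); // now tests Reg == 21, which cannot hold after jl
}
```

In this model the `Reg == 20` case reaches `baz` in the original sequence but never in the rewritten one, which matches the observation that one switch case gets lost.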

Thanks for reporting!

The optimization should have detected EFLAGS being live out of the bar block in your example and aborted, because there could potentially be more users (see line 4550).

Is there a way to reproduce this easily?

hans added a reverting change: rGa2a58d91e82d: Revert "X86InstrInfo: Support immediates that are +1/-1 different in…. Nov 3 2021, 9:01 AM

Actually I think I see what is going wrong; let me attempt a fix in the next few hours.

I've reverted back to green in https://github.com/llvm/llvm-project/commit/a2a58d91e82db38fbdf88cc317dcb3753d79d492 until this can be fixed.

FWIW, we also saw a non-determinism issue as a result of this patch in a stage2 PGO'd build of clang.

MatzeB reopened this revision. Nov 3 2021, 10:22 AM

This revision is now accepted and ready to land. Nov 3 2021, 10:22 AM

Fix the EFLAGS live-out check not triggering, even though we need to update CC flags for ImmDelta != 0.
While at it, rename IsSafe to the more descriptive FlagsMayLiveOut.
Added new tests opt_adjusted_imm_multiple_blocks and opt_adjusted_imm_multiple_blocks_noopt.
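As a rough sketch of why the live-out check could be skipped, the two guard conditions visible in the side-by-side diff below can be compared directly. This assumes the first landed version still used the base guard from the left column; that intermediate version is not shown on this page, so treat the comparison as illustrative. The function names are hypothetical.

```cpp
#include <cassert>

// Guard from the left (base) column of the diff: successors are only
// checked for live-in EFLAGS when MI is set or the operands are swapped.
// Assumption: the first landed version of the patch still used this form.
bool baseGuardChecksSuccessors(bool HasMI, bool IsSwapped, bool IsSafe) {
  return (HasMI || IsSwapped) && !IsSafe;
}

// Guard from the right (final) column: any pending condition-code update
// triggers the successor check while the flags may still be live-out.
bool finalGuardChecksSuccessors(bool HasMI, bool ShouldUpdateCC,
                                bool FlagsMayLiveOut) {
  return (HasMI || ShouldUpdateCC) && FlagsMayLiveOut;
}
```

For a compare that is redundant only up to ImmDelta != 0 (no MI, not swapped, flags neither redefined nor killed in the block), the base guard skips the successor check while the final guard performs it.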

I believe the new version of this diff fixes the reported issue. Do you have a way to confirm?

Harbormaster completed remote builds in B132267: Diff 384505. Nov 3 2021, 11:07 AM

In theory you should be able to reproduce the non-determinism issue, as it's just clang (nothing internal), but it may depend on some particular workflow we do internally. I'll patch it and report back.

I don't have a way to verify the miscompile Hans is talking about.

I too saw issues with this patch with PGO (I was in the middle of writing up a reproducer when I saw Hans's comment). My symptom was that stage 3 check-llvm would not pass, whereas stage 2 would. This version of the patch on top of 7277d2e1c86bf4d75321efc1195d88ade4bedfa1 does not have that same issue.

This version of the patch LGTM with regards to non-determinism

The reproducer I was using looks good with the latest version of the patch. Thanks!

Thanks for confirming! Will re-commit the new version then.

This revision was landed with ongoing or failed builds. Nov 3 2021, 2:13 PM

Closed by commit rG847a6807332b: X86InstrInfo: Support immediates that are +1/-1 different in… (authored by MatzeB).

This revision was automatically updated to reflect the committed changes.

MatzeB added a commit: rG847a6807332b: X86InstrInfo: Support immediates that are +1/-1 different in….

We have root-caused a miscompile affecting AMD Rome machines to this revision.
I have attached a reproducer program that shows the problem when compiled with -O2 or -O3.

Please revert this change until you have a chance to look at the problem.

NOTE: the repro uses the abseil library, which can be found at https://github.com/abseil/abseil-cpp. The reproducer could probably be reduced further (if needed), but it would be great if you could revert early.

Reproducer:

repro_git.cc (1 KB)

Please post a standalone reproducer, one that does not require guessing versions of external dependencies,
and the actual reproduction steps, as in the actual complete compilation and run command.

Reduced reproducer:

reduced.cc (5 KB)

clang -O3 -std=gnu++17 -fsized-deallocation -x c++ reduced.cc -lc++abi -pthread -static -o repro

./repro is expected to always return 0.
It will return 0 on Intel machines and 186 on AMD machines when built with clang at this revision.
It will return 0 on all machines when built with a previous version of clang.

Please revert and let's work together after that if you need more info.

Thanks for investigating and root-causing! I’m currently on vacation, please revert in the meantime.

bgraur added a reverting change: D115528: Revert "X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr". Dec 10 2021, 8:13 AM

alexfh mentioned this in D115528: Revert "X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr". Dec 10 2021, 8:16 AM

Thanks! Created a revert here: https://reviews.llvm.org/D115528

There appears to be one change:
https://gcc.godbolt.org/z/TfKrs8M18

llvm13:

shlq $32, %rax
testq %rax, %rax
je .LBB5_11
jle .LBB5_11

(current) trunk:

shlq $32, %rax
je .LBB5_11
jle .LBB5_11

The jle would use the OF flag, which isn't defined for shifts by more than 1. And even when it is defined, it doesn't have the same meaning.
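A small model shows why dropping the `testq` matters for `jle`. The sketch below encodes the architectural `jle` condition (ZF set, or SF different from OF) and treats OF after the multi-bit shift as an arbitrary input, since it is undefined there. The struct and function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

struct Flags {
  bool ZF;
  bool SF;
  bool OF;
};

// jle is taken when ZF is set or SF differs from OF.
bool jle(Flags F) { return F.ZF || F.SF != F.OF; }

// `testq %rax, %rax`: ZF and SF reflect the value, OF is cleared.
Flags afterTest(int64_t Value) { return {Value == 0, Value < 0, false}; }

// `shlq $32, %rax`: ZF and SF reflect the result, but OF is undefined for
// shift counts above 1, so it is modeled here as a caller-supplied value.
Flags afterShl(int64_t Result, bool UndefinedOF) {
  return {Result == 0, Result < 0, UndefinedOF};
}
```

For a positive shift result such as 4294967296, `jle(afterTest(...))` is false, while `jle(afterShl(..., true))` is true. The branch outcome therefore depends on whatever the hardware happens to leave in OF, which is consistent with the machine-dependent behavior reported above.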

An LGTM on the revert would be greatly appreciated.

RKSimon reopened this revision. Dec 10 2021, 10:35 AM

This revision is now accepted and ready to land. Dec 10 2021, 10:35 AM

RKSimon requested changes to this revision. Dec 10 2021, 10:35 AM

This revision now requires changes to proceed. Dec 10 2021, 10:35 AM

bgraur added a reverting change: rGea81cea8163a: Revert "X86InstrInfo: Support immediates that are +1/-1 different in…. Dec 10 2021, 2:02 PM

Fix incomplete condition (again) on when to abort on eflags being live-out of the block. This fixes the problem reported by @bgraur

Harbormaster completed remote builds in B141476: Diff 397268. Jan 4 2022, 5:30 AM

LGTM - cheers

This revision is now accepted and ready to land. Jan 10 2022, 9:59 AM

Closing - this was pushed again at rGad25f8a556d239d8b7d17383cf1a0771359521fd

Revision Contents

Path	Size
llvm/lib/Target/X86/X86InstrInfo.h	3 lines
llvm/lib/Target/X86/X86InstrInfo.cpp	92 lines
llvm/test/CodeGen/X86/optimize-compare.mir	358 lines
llvm/test/CodeGen/X86/peep-test-5.ll	56 lines
llvm/test/CodeGen/X86/use-cr-result-of-dom-icmp-st.ll	12 lines

Diff 397268

llvm/lib/Target/X86/X86InstrInfo.h

[... 637 lines not shown ...]	private:
///		///
/// Examples of OI, FlagI pairs returning true:		/// Examples of OI, FlagI pairs returning true:
/// CMP %1, 42 and CMP %1, 42		/// CMP %1, 42 and CMP %1, 42
/// CMP %1, %2 and %3 = SUB %1, %2		/// CMP %1, %2 and %3 = SUB %1, %2
/// TEST %1, %1 and %2 = SUB %1, 0		/// TEST %1, %1 and %2 = SUB %1, 0
/// CMP %1, %2 and %3 = SUB %2, %1 ; IsSwapped=true		/// CMP %1, %2 and %3 = SUB %2, %1 ; IsSwapped=true
bool isRedundantFlagInstr(const MachineInstr &FlagI, Register SrcReg,		bool isRedundantFlagInstr(const MachineInstr &FlagI, Register SrcReg,
Register SrcReg2, int64_t ImmMask, int64_t ImmValue,		Register SrcReg2, int64_t ImmMask, int64_t ImmValue,
const MachineInstr &OI, bool *IsSwapped) const;		const MachineInstr &OI, bool *IsSwapped,
		int64_t *ImmDelta) const;
};		};

} // namespace llvm		} // namespace llvm

#endif		#endif

llvm/lib/Target/X86/X86InstrInfo.cpp

[... 4,082 lines not shown ...]	case X86::TEST64rr:
return true;		return true;
}		}
return false;		return false;
}		}

bool X86InstrInfo::isRedundantFlagInstr(const MachineInstr &FlagI,		bool X86InstrInfo::isRedundantFlagInstr(const MachineInstr &FlagI,
Register SrcReg, Register SrcReg2,		Register SrcReg, Register SrcReg2,
int64_t ImmMask, int64_t ImmValue,		int64_t ImmMask, int64_t ImmValue,
const MachineInstr &OI,		const MachineInstr &OI, bool *IsSwapped,
bool *IsSwapped) const {		int64_t *ImmDelta) const {
switch (OI.getOpcode()) {		switch (OI.getOpcode()) {
case X86::CMP64rr:		case X86::CMP64rr:
case X86::CMP32rr:		case X86::CMP32rr:
case X86::CMP16rr:		case X86::CMP16rr:
case X86::CMP8rr:		case X86::CMP8rr:
case X86::SUB64rr:		case X86::SUB64rr:
case X86::SUB32rr:		case X86::SUB32rr:
case X86::SUB16rr:		case X86::SUB16rr:
[... 34 lines not shown ...]	bool X86InstrInfo::isRedundantFlagInstr(const MachineInstr &FlagI,
case X86::TEST16rr:		case X86::TEST16rr:
case X86::TEST8rr: {		case X86::TEST8rr: {
if (ImmMask != 0) {		if (ImmMask != 0) {
Register OISrcReg;		Register OISrcReg;
Register OISrcReg2;		Register OISrcReg2;
int64_t OIMask;		int64_t OIMask;
int64_t OIValue;		int64_t OIValue;
if (analyzeCompare(OI, OISrcReg, OISrcReg2, OIMask, OIValue) &&		if (analyzeCompare(OI, OISrcReg, OISrcReg2, OIMask, OIValue) &&
SrcReg == OISrcReg && ImmMask == OIMask && OIValue == ImmValue) {		SrcReg == OISrcReg && ImmMask == OIMask) {
assert(SrcReg2 == X86::NoRegister && OISrcReg2 == X86::NoRegister &&		if (OIValue == ImmValue) {
"should not have 2nd register");		*ImmDelta = 0;
		return true;
		} else if (static_cast<uint64_t>(ImmValue) ==
		static_cast<uint64_t>(OIValue) - 1) {
		*ImmDelta = -1;
		return true;
		} else if (static_cast<uint64_t>(ImmValue) ==
		static_cast<uint64_t>(OIValue) + 1) {
		*ImmDelta = 1;
return true;		return true;
		} else {
		return false;
		}
}		}
}		}
return FlagI.isIdenticalTo(OI);		return FlagI.isIdenticalTo(OI);
}		}
default:		default:
return false;		return false;
}		}
}		}
[... 233 lines not shown ...]	bool X86InstrInfo::optimizeCompareInstr(MachineInstr &CmpInstr, Register SrcReg,
MachineInstr *MI = nullptr;		MachineInstr *MI = nullptr;
MachineInstr *Sub = nullptr;		MachineInstr *Sub = nullptr;
MachineInstr *Movr0Inst = nullptr;		MachineInstr *Movr0Inst = nullptr;
bool NoSignFlag = false;		bool NoSignFlag = false;
bool ClearsOverflowFlag = false;		bool ClearsOverflowFlag = false;
bool ShouldUpdateCC = false;		bool ShouldUpdateCC = false;
bool IsSwapped = false;		bool IsSwapped = false;
X86::CondCode NewCC = X86::COND_INVALID;		X86::CondCode NewCC = X86::COND_INVALID;
		int64_t ImmDelta = 0;

// Search backward from CmpInstr for the next instruction defining EFLAGS.		// Search backward from CmpInstr for the next instruction defining EFLAGS.
const TargetRegisterInfo *TRI = &getRegisterInfo();		const TargetRegisterInfo *TRI = &getRegisterInfo();
MachineBasicBlock &CmpMBB = *CmpInstr.getParent();		MachineBasicBlock &CmpMBB = *CmpInstr.getParent();
MachineBasicBlock::reverse_iterator From =		MachineBasicBlock::reverse_iterator From =
std::next(MachineBasicBlock::reverse_iterator(CmpInstr));		std::next(MachineBasicBlock::reverse_iterator(CmpInstr));
for (MachineBasicBlock *MBB = &CmpMBB;;) {		for (MachineBasicBlock *MBB = &CmpMBB;;) {
for (MachineInstr &Inst : make_range(From, MBB->rend())) {		for (MachineInstr &Inst : make_range(From, MBB->rend())) {
[... 30 lines not shown ...]	for (MachineInstr &Inst : make_range(From, MBB->rend())) {
}		}

// Try to use EFLAGS from an instruction with similar flag results.		// Try to use EFLAGS from an instruction with similar flag results.
// Example:		// Example:
// sub x, y or cmp x, y		// sub x, y or cmp x, y
// ... // EFLAGS not changed		// ... // EFLAGS not changed
// cmp x, y // <-- can be removed		// cmp x, y // <-- can be removed
if (isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpMask, CmpValue,		if (isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpMask, CmpValue,
Inst, &IsSwapped)) {		Inst, &IsSwapped, &ImmDelta)) {
Sub = &Inst;		Sub = &Inst;
break;		break;
}		}

// MOV32r0 is implemented with xor which clobbers condition code. It is		// MOV32r0 is implemented with xor which clobbers condition code. It is
// safe to move up, if the definition to EFLAGS is dead and earlier		// safe to move up, if the definition to EFLAGS is dead and earlier
// instructions do not read or write EFLAGS.		// instructions do not read or write EFLAGS.
if (!Movr0Inst && Inst.getOpcode() == X86::MOV32r0 &&		if (!Movr0Inst && Inst.getOpcode() == X86::MOV32r0 &&
[... 17 lines not shown ...]	for (MachineBasicBlock *MBB = &CmpMBB;;) {
MBB = *MBB->pred_begin();		MBB = *MBB->pred_begin();
From = MBB->rbegin();		From = MBB->rbegin();
}		}

// Scan forward from the instruction after CmpInstr for uses of EFLAGS.		// Scan forward from the instruction after CmpInstr for uses of EFLAGS.
// It is safe to remove CmpInstr if EFLAGS is redefined or killed.		// It is safe to remove CmpInstr if EFLAGS is redefined or killed.
// If we are done with the basic block, we need to check whether EFLAGS is		// If we are done with the basic block, we need to check whether EFLAGS is
// live-out.		// live-out.
bool IsSafe = false;		bool FlagsMayLiveOut = true;
SmallVector<std::pair<MachineInstr*, X86::CondCode>, 4> OpsToUpdate;		SmallVector<std::pair<MachineInstr*, X86::CondCode>, 4> OpsToUpdate;
MachineBasicBlock::iterator AfterCmpInstr =		MachineBasicBlock::iterator AfterCmpInstr =
std::next(MachineBasicBlock::iterator(CmpInstr));		std::next(MachineBasicBlock::iterator(CmpInstr));
for (MachineInstr &Instr : make_range(AfterCmpInstr, CmpMBB.end())) {		for (MachineInstr &Instr : make_range(AfterCmpInstr, CmpMBB.end())) {
bool ModifyEFLAGS = Instr.modifiesRegister(X86::EFLAGS, TRI);		bool ModifyEFLAGS = Instr.modifiesRegister(X86::EFLAGS, TRI);
bool UseEFLAGS = Instr.readsRegister(X86::EFLAGS, TRI);		bool UseEFLAGS = Instr.readsRegister(X86::EFLAGS, TRI);
// We should check the usage if this instruction uses and updates EFLAGS.		// We should check the usage if this instruction uses and updates EFLAGS.
if (!UseEFLAGS && ModifyEFLAGS) {		if (!UseEFLAGS && ModifyEFLAGS) {
// It is safe to remove CmpInstr if EFLAGS is updated again.		// It is safe to remove CmpInstr if EFLAGS is updated again.
IsSafe = true;		FlagsMayLiveOut = false;
break;		break;
}		}
if (!UseEFLAGS && !ModifyEFLAGS)		if (!UseEFLAGS && !ModifyEFLAGS)
continue;		continue;

// EFLAGS is used by this instruction.		// EFLAGS is used by this instruction.
X86::CondCode OldCC = X86::COND_INVALID;		X86::CondCode OldCC = X86::COND_INVALID;
if (MI || IsSwapped) {		if (MI || IsSwapped || ImmDelta != 0) {
// We decode the condition code from opcode.		// We decode the condition code from opcode.
if (Instr.isBranch())		if (Instr.isBranch())
OldCC = X86::getCondFromBranch(Instr);		OldCC = X86::getCondFromBranch(Instr);
else {		else {
OldCC = X86::getCondFromSETCC(Instr);		OldCC = X86::getCondFromSETCC(Instr);
if (OldCC == X86::COND_INVALID)		if (OldCC == X86::COND_INVALID)
OldCC = X86::getCondFromCMov(Instr);		OldCC = X86::getCondFromCMov(Instr);
}		}
[... 37 lines not shown ...]	if (MI) {
}		}
} else if (IsSwapped) {		} else if (IsSwapped) {
// If we have SUB(r1, r2) and CMP(r2, r1), the condition code needs		// If we have SUB(r1, r2) and CMP(r2, r1), the condition code needs
// to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 > r2, etc.		// to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 > r2, etc.
// We swap the condition code and synthesize the new opcode.		// We swap the condition code and synthesize the new opcode.
ReplacementCC = getSwappedCondition(OldCC);		ReplacementCC = getSwappedCondition(OldCC);
if (ReplacementCC == X86::COND_INVALID)		if (ReplacementCC == X86::COND_INVALID)
return false;		return false;
		ShouldUpdateCC = true;
		} else if (ImmDelta != 0) {
		unsigned BitWidth = TRI->getRegSizeInBits(*MRI->getRegClass(SrcReg));
		// Shift amount for min/max constants to adjust for 8/16/32 instruction
		// sizes.
		switch (OldCC) {
		case X86::COND_L: // x <s (C + 1) --> x <=s C
		if (ImmDelta != 1 || APInt::getSignedMinValue(BitWidth) == CmpValue)
		return false;
		ReplacementCC = X86::COND_LE;
		break;
		case X86::COND_B: // x <u (C + 1) --> x <=u C
		if (ImmDelta != 1 || CmpValue == 0)
		return false;
		ReplacementCC = X86::COND_BE;
		break;
		case X86::COND_GE: // x >=s (C + 1) --> x >s C
		if (ImmDelta != 1 || APInt::getSignedMinValue(BitWidth) == CmpValue)
		return false;
		ReplacementCC = X86::COND_G;
		break;
		case X86::COND_AE: // x >=u (C + 1) --> x >u C
		if (ImmDelta != 1 || CmpValue == 0)
		return false;
		ReplacementCC = X86::COND_A;
		break;
		case X86::COND_G: // x >s (C - 1) --> x >=s C
		if (ImmDelta != -1 || APInt::getSignedMaxValue(BitWidth) == CmpValue)
		return false;
		ReplacementCC = X86::COND_GE;
		break;
		case X86::COND_A: // x >u (C - 1) --> x >=u C
		if (ImmDelta != -1 || APInt::getMaxValue(BitWidth) == CmpValue)
		return false;
		ReplacementCC = X86::COND_AE;
		break;
		case X86::COND_LE: // x <=s (C - 1) --> x <s C
		if (ImmDelta != -1 || APInt::getSignedMaxValue(BitWidth) == CmpValue)
		return false;
		ReplacementCC = X86::COND_L;
		break;
		case X86::COND_BE: // x <=u (C - 1) --> x <u C
		if (ImmDelta != -1 || APInt::getMaxValue(BitWidth) == CmpValue)
		return false;
		ReplacementCC = X86::COND_B;
		break;
		default:
		return false;
		}
		ShouldUpdateCC = true;
}		}

if ((ShouldUpdateCC || IsSwapped) && ReplacementCC != OldCC) {		if (ShouldUpdateCC && ReplacementCC != OldCC) {
// Push the MachineInstr to OpsToUpdate.		// Push the MachineInstr to OpsToUpdate.
// If it is safe to remove CmpInstr, the condition code of these		// If it is safe to remove CmpInstr, the condition code of these
// instructions will be modified.		// instructions will be modified.
OpsToUpdate.push_back(std::make_pair(&Instr, ReplacementCC));		OpsToUpdate.push_back(std::make_pair(&Instr, ReplacementCC));
}		}
if (ModifyEFLAGS || Instr.killsRegister(X86::EFLAGS, TRI)) {		if (ModifyEFLAGS || Instr.killsRegister(X86::EFLAGS, TRI)) {
// It is safe to remove CmpInstr if EFLAGS is updated again or killed.		// It is safe to remove CmpInstr if EFLAGS is updated again or killed.
IsSafe = true;		FlagsMayLiveOut = false;
break;		break;
}		}
}		}

// If EFLAGS is not killed nor re-defined, we should check whether it is		// If we have to update users but EFLAGS is live-out abort, since we cannot
// live-out. If it is live-out, do not optimize.		// easily find all of the users.
if ((MI || IsSwapped) && !IsSafe) {		if ((MI != nullptr || ShouldUpdateCC) && FlagsMayLiveOut) {
for (MachineBasicBlock *Successor : CmpMBB.successors())		for (MachineBasicBlock *Successor : CmpMBB.successors())
if (Successor->isLiveIn(X86::EFLAGS))		if (Successor->isLiveIn(X86::EFLAGS))
return false;		return false;
}		}

// The instruction to be updated is either Sub or MI.		// The instruction to be updated is either Sub or MI.
assert((MI == nullptr || Sub == nullptr) && "Should not have Sub and MI set");		assert((MI == nullptr || Sub == nullptr) && "Should not have Sub and MI set");
Sub = MI != nullptr ? MI : Sub;		Sub = MI != nullptr ? MI : Sub;
[... 4,826 lines not shown ...]

llvm/test/CodeGen/X86/optimize-compare.mir

; CHECK-NEXT: $cl = SETCCr 3, implicit $eflags		; CHECK-NEXT: $cl = SETCCr 3, implicit $eflags
%0:gr64 = COPY $rsi		%0:gr64 = COPY $rsi
CMP64ri32 %0, @opt_redundant_flags_cmp_addr_noopt + 24, implicit-def $eflags		CMP64ri32 %0, @opt_redundant_flags_cmp_addr_noopt + 24, implicit-def $eflags
$cl = SETCCr 7, implicit $eflags		$cl = SETCCr 7, implicit $eflags
; CMP should not be removed		; CMP should not be removed
CMP64ri32 %0, 24, implicit-def $eflags		CMP64ri32 %0, 24, implicit-def $eflags
$cl = SETCCr 3, implicit $eflags		$cl = SETCCr 3, implicit $eflags
...		...
		---
		name: opt_redundant_flags_adjusted_imm_0
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_redundant_flags_adjusted_imm_0
		; CHECK: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
		; CHECK-NEXT: CMP64ri8 [[COPY]], 1, implicit-def $eflags
		; CHECK-NEXT: $cl = SETCCr 4, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 15, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 7, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 14, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 6, implicit $eflags
		%0:gr64 = COPY $rsi
		; CMP+SETCC %0 == 1
		CMP64ri8 %0, 1, implicit-def $eflags
		$cl = SETCCr 4, implicit $eflags
		; CMP+SETCC %0 >= 2; CMP can be removed.
		CMP64ri8 %0, 2, implicit-def $eflags
		; %0 >=s 2 --> %0 >s 1
		$bl = SETCCr 13, implicit $eflags
		; %0 >=u 2 --> %0 >u 1
		$bl = SETCCr 3, implicit $eflags
		; %0 <s 2 --> %0 <=s 1
		$bl = SETCCr 12, implicit $eflags
		; %0 <u 2 --> %0 <=u 1
		$bl = SETCCr 2, implicit $eflags
		...
		---
		name: opt_redundant_flags_adjusted_imm_1
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_redundant_flags_adjusted_imm_1
		; CHECK: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
		; CHECK-NEXT: CMP64ri8 [[COPY]], 42, implicit-def $eflags
		; CHECK-NEXT: $cl = SETCCr 5, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 13, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 3, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 12, implicit $eflags
		; CHECK-NEXT: $bl = SETCCr 2, implicit $eflags
		%0:gr64 = COPY $rsi
		; CMP+SETCC %0 != 42
		CMP64ri8 %0, 42, implicit-def $eflags
		$cl = SETCCr 5, implicit $eflags
		; CMP+SETCC %0 > 41; CMP can be removed.
		CMP64ri8 %0, 41, implicit-def $eflags
		; %0 >s 41 --> %0 >=s 42
		$bl = SETCCr 15, implicit $eflags
		; %0 >u 41 --> %0 >=u 42
		$bl = SETCCr 7, implicit $eflags
		; %0 <=s 41 --> %0 <s 42
		$bl = SETCCr 14, implicit $eflags
		; %0 <=u 41 --> %0 <u 42
		$bl = SETCCr 6, implicit $eflags
		...
		---
		name: opt_redundant_flags_adjusted_imm_test_cmp
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_redundant_flags_adjusted_imm_test_cmp
		; CHECK: [[COPY:%[0-9]+]]:gr8 = COPY $bl
		; CHECK-NEXT: TEST8rr [[COPY]], [[COPY]], implicit-def $eflags
		; CHECK-NEXT: $cl = SETCCr 14, implicit $eflags
		; CHECK-NEXT: $cl = SETCCr 7, implicit $eflags
		; CHECK-NEXT: $cl = SETCCr 12, implicit $eflags
		%0:gr8 = COPY $bl
		TEST8rr %0, %0, implicit-def $eflags
		; SET %0 <=s 0
		$cl = SETCCr 14, implicit $eflags
		; CMP should be removed (%0 >=u 1)
		CMP8ri %0, 1, implicit-def $eflags
		$cl = SETCCr 3, implicit $eflags

		; CMP should be removed (%0 <=s -1)
		CMP8ri %0, -1, implicit-def $eflags
		$cl = SETCCr 14, implicit $eflags
		...
		---
		name: opt_redundant_flags_adjusted_imm_cmp_test
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_redundant_flags_adjusted_imm_cmp_test
		; CHECK: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
		; CHECK-NEXT: CMP64ri32 [[COPY]], 1, implicit-def $eflags
		; CHECK-NEXT: $cl = SETCCr 13, implicit $eflags
		; CHECK-NEXT: [[COPY1:%[0-9]+]]:gr64 = COPY $edi
		; CHECK-NEXT: CMP64ri32 [[COPY1]], -1, implicit-def $eflags
		; CHECK-NEXT: $cl = SETCCr 14, implicit $eflags
		%0:gr64 = COPY $rsi
		CMP64ri32 %0, 1, implicit-def $eflags
		; TEST should be removed
		TEST64rr %0, %0, implicit-def $eflags
		$cl = SETCCr 15, implicit $eflags

		%1:gr64 = COPY $edi
		CMP64ri32 %1, -1, implicit-def $eflags
		; TEST should be removed
		TEST64rr %1, %1, implicit-def $eflags
		$cl = SETCCr 12, implicit $eflags
		...
		---
		name: opt_redundant_flags_adjusted_imm_noopt_0
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_redundant_flags_adjusted_imm_noopt_0
		; CHECK: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
		; CHECK-NEXT: CMP64ri8 [[COPY]], 42, implicit-def $eflags
		; CHECK-NEXT: $cl = SETCCr 4, implicit $eflags
		; CHECK-NEXT: CMP64ri8 [[COPY]], 41, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 4, implicit $eflags
		%0:gr64 = COPY $rsi
		; CMP+SETCC %0 == 42
		CMP64ri8 %0, 42, implicit-def $eflags
		$cl = SETCCr 4, implicit $eflags
		; CMP should not be removed.
		CMP64ri8 %0, 41, implicit-def $eflags
		; %0 == 41
		$bl = SETCCr 4, implicit $eflags
		...
		---
		name: opt_redundant_flags_adjusted_imm_noopt_1
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_redundant_flags_adjusted_imm_noopt_1
		; CHECK: [[COPY:%[0-9]+]]:gr32 = COPY $esi
		; CHECK-NEXT: CMP32ri [[COPY]], 2147483647, implicit-def $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], -2147483648, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 12, implicit $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], 4294967295, implicit-def $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], -2147483648, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 12, implicit $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], 2147483647, implicit-def $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], -2147483648, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 13, implicit $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], 4294967295, implicit-def $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], 0, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 2, implicit $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], 4294967295, implicit-def $eflags
		; CHECK-NEXT: CMP32ri [[COPY]], 0, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 3, implicit $eflags
		%0:gr32 = COPY $esi
		; CMP+SETCC %0 == INT32_MAX
		CMP32ri %0, 2147483647, implicit-def $eflags
		; CMP should not be removed.
		CMP32ri %0, -2147483648, implicit-def $eflags
		; %0 <s INT32_MIN
		$bl = SETCCr 12, implicit $eflags

		CMP32ri %0, 4294967295, implicit-def $eflags
		; CMP should not be removed.
		CMP32ri %0, -2147483648, implicit-def $eflags
		$bl = SETCCr 12, implicit $eflags

		CMP32ri %0, 2147483647, implicit-def $eflags
		; CMP should not be removed.
		CMP32ri %0, -2147483648, implicit-def $eflags
		$bl = SETCCr 13, implicit $eflags

		CMP32ri %0, 4294967295, implicit-def $eflags
		; should not be removed
		CMP32ri %0, 0, implicit-def $eflags
		$bl = SETCCr 2, implicit $eflags

		CMP32ri %0, 4294967295, implicit-def $eflags
		; should not be removed
		CMP32ri %0, 0, implicit-def $eflags
		$bl = SETCCr 3, implicit $eflags
		...
		---
		name: opt_redundant_flags_adjusted_imm_noopt_2
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_redundant_flags_adjusted_imm_noopt_2
		; CHECK: [[COPY:%[0-9]+]]:gr16 = COPY $cx
		; CHECK-NEXT: CMP16ri [[COPY]], -32768, implicit-def $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 32767, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 15, implicit $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 65535, implicit-def $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 32767, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 15, implicit $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], -32768, implicit-def $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 32767, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 14, implicit $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 0, implicit-def $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 65535, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 4, implicit $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 0, implicit-def $eflags
		; CHECK-NEXT: CMP16ri [[COPY]], 65535, implicit-def $eflags
		; CHECK-NEXT: $bl = SETCCr 6, implicit $eflags
		%0:gr16 = COPY $cx
		; CMP+SETCC %0 == INT16_MIN
		CMP16ri %0, -32768, implicit-def $eflags
		; CMP should not be removed.
		CMP16ri %0, 32767, implicit-def $eflags
		; %0 >s INT16_MAX
		$bl = SETCCr 15, implicit $eflags

		CMP16ri %0, 65535, implicit-def $eflags
		; CMP should not be removed.
		CMP16ri %0, 32767, implicit-def $eflags
		$bl = SETCCr 15, implicit $eflags

		CMP16ri %0, -32768, implicit-def $eflags
		; CMP should not be removed.
		CMP16ri %0, 32767, implicit-def $eflags
		$bl = SETCCr 14, implicit $eflags

		CMP16ri %0, 0, implicit-def $eflags
		; should not be removed
		CMP16ri %0, 65535, implicit-def $eflags
		$bl = SETCCr 4, implicit $eflags

		CMP16ri %0, 0, implicit-def $eflags
		; should not be removed
		CMP16ri %0, 65535, implicit-def $eflags
		$bl = SETCCr 6, implicit $eflags
		...
		---
		name: opt_adjusted_imm_multiple_blocks
		body: |
		; CHECK-LABEL: name: opt_adjusted_imm_multiple_blocks
		; CHECK: bb.0:
		; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.3(0x40000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $eax
		; CHECK-NEXT: CMP32ri [[COPY]], 20, implicit-def $eflags
		; CHECK-NEXT: JCC_1 %bb.1, 4, implicit $eflags
		; CHECK-NEXT: JMP_1 %bb.3
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.1:
		; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.3(0x40000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: JCC_1 %bb.2, 15, implicit $eflags
		; CHECK-NEXT: JMP_1 %bb.3
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.2:
		; CHECK-NEXT: successors: %bb.3(0x80000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: JMP_1 %bb.3
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.3:
		; CHECK-NEXT: RET 0
		bb.0:
		%0:gr32 = COPY $eax
		CMP32ri %0, 20, implicit-def $eflags
		JCC_1 %bb.1, 4, implicit $eflags
		JMP_1 %bb.3

		bb.1:
		; CMP can be removed when adjusting the JCC.
		CMP32ri %0, 21, implicit-def $eflags
		JCC_1 %bb.2, 13, implicit $eflags
		JMP_1 %bb.3

		bb.2:
		JMP_1 %bb.3

		bb.3:
		RET 0
		...
		---
		name: opt_adjusted_imm_multiple_blocks_noopt
		body: |
		; CHECK-LABEL: name: opt_adjusted_imm_multiple_blocks_noopt
		; CHECK: bb.0:
		; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.3(0x40000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $eax
		; CHECK-NEXT: CMP32ri [[COPY]], 20, implicit-def $eflags
		; CHECK-NEXT: JCC_1 %bb.1, 4, implicit $eflags
		; CHECK-NEXT: JMP_1 %bb.3
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.1:
		; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.3(0x40000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: CMP32ri [[COPY]], 21, implicit-def $eflags
		; CHECK-NEXT: JCC_1 %bb.2, 13, implicit $eflags
		; CHECK-NEXT: JMP_1 %bb.3
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.2:
		; CHECK-NEXT: successors: %bb.3(0x80000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: $al = SETCCr 4, implicit $eflags
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.3:
		; CHECK-NEXT: RET 0
		bb.0:
		%0:gr32 = COPY $eax
		CMP32ri %0, 20, implicit-def $eflags
		JCC_1 %bb.1, 4, implicit $eflags
		JMP_1 %bb.3

		bb.1:
		; The following CMP should not be optimized because $eflags is live-out
		CMP32ri %0, 21, implicit-def $eflags
		JCC_1 %bb.2, 13, implicit $eflags
		JMP_1 %bb.3

		bb.2:
		liveins: $eflags
		$al = SETCCr 4, implicit $eflags

		bb.3:
		RET 0
		...
		---
		name: opt_shift_cmp_zero
		body: |
		bb.0:
		; CHECK-LABEL: name: opt_shift_cmp_zero
		; CHECK: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
		; CHECK-NEXT: [[SHL64ri:%[0-9]+]]:gr64 = SHL64ri [[COPY]], 7, implicit-def $eflags
		; CHECK-NEXT: $al = SETCCr 4, implicit $eflags
		%0:gr64 = COPY $rsi
		%1:gr64 = SHL64ri %0, 7, implicit-def dead $eflags
		; TEST should be removed.
		TEST64rr %1, %1, implicit-def $eflags
		$al = SETCCr 4, implicit $eflags
		...
		---
		name: noopt_shift_cmp_zero
		body: |
		bb.0:
		; CHECK-LABEL: name: noopt_shift_cmp_zero
		; CHECK: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
		; CHECK-NEXT: [[SHL64ri:%[0-9]+]]:gr64 = SHL64ri [[COPY]], 9, implicit-def dead $eflags
		; CHECK-NEXT: TEST64rr [[SHL64ri]], [[SHL64ri]], implicit-def $eflags
		; CHECK-NEXT: $al = SETCCr 14, implicit $eflags
		%0:gr64 = COPY $rsi
		%1:gr64 = SHL64ri %0, 9, implicit-def dead $eflags
		; TEST cannot be removed if a user relies on the OF flag.
		TEST64rr %1, %1, implicit-def $eflags
		$al = SETCCr 14, implicit $eflags
		...
		---
		name: noopt_shift_cmp_zero_multiblock
		body: |
		; CHECK-LABEL: name: noopt_shift_cmp_zero_multiblock
		; CHECK: bb.0:
		; CHECK-NEXT: successors: %bb.1(0x80000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
		; CHECK-NEXT: [[SHL64ri:%[0-9]+]]:gr64 = SHL64ri [[COPY]], 9, implicit-def dead $eflags
		; CHECK-NEXT: TEST64rr [[SHL64ri]], [[SHL64ri]], implicit-def $eflags
		; CHECK-NEXT: JMP_1 %bb.1
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.1:
		; CHECK-NEXT: $al = SETCCr 14, implicit $eflags
		bb.0:
		%0:gr64 = COPY $rsi
		%1:gr64 = SHL64ri %0, 9, implicit-def dead $eflags
		; TEST cannot be removed if a user relies on the OF flag.
		TEST64rr %1, %1, implicit-def $eflags
		JMP_1 %bb.1

		bb.1:
		liveins: $eflags
		$al = SETCCr 14, implicit $eflags
		...
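
The numeric operand of `SETCCr` and `JCC_1` in the MIR tests above is the x86 condition-code encoding (for example, 4 is "equal" and 13 is "signed greater or equal"). As a reading aid, here is a small decoding table; `kCondNames` and `condName` are hypothetical helper names, not part of LLVM, and the strings are the usual assembler suffixes.

```cpp
#include <array>
#include <string_view>

// x86 condition-code encodings 0..15, indexed by the immediate that appears
// after SETCCr / JCC_1 in the MIR tests. Hypothetical helper table.
constexpr std::array<std::string_view, 16> kCondNames = {
    "o",  "no", "b", "ae", // 0..3:   overflow, no overflow, <u, >=u
    "e",  "ne", "be", "a", // 4..7:   ==, !=, <=u, >u
    "s",  "ns", "p", "np", // 8..11:  sign, no sign, parity, no parity
    "l",  "ge", "le", "g"  // 12..15: <s, >=s, <=s, >s
};

// Returns the mnemonic suffix for a condition-code immediate, or "invalid"
// if the value is outside 0..15.
constexpr std::string_view condName(unsigned imm) {
  return imm < kCondNames.size() ? kCondNames[imm] : std::string_view("invalid");
}
```

With this table, the rewrite in `opt_redundant_flags_adjusted_imm_0` reads as `ge` (13) becoming `g` (15), `ae` (3) becoming `a` (7), `l` (12) becoming `le` (14), and `b` (2) becoming `be` (6).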

llvm/test/CodeGen/X86/peep-test-5.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -o - %s -mtriple=x86_64-- | FileCheck %s
				; Example of a decref operation with "immortal" objects.
				; void decref(long* refcount) {
				; long count = *refcount;
				; if (count == 1) { free_object() }
				; else if (count > 1) { *refcount = count - 1; }
				; else { /* immortal */ }
				; }
				; Resulting assembly should share flags from single CMP instruction for both
				; conditions!
				define void @decref(i32* %p) {
				; CHECK-LABEL: decref:
				; CHECK: # %bb.0:
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movl (%rdi), %eax
				; CHECK-NEXT: cmpl $1, %eax
				; CHECK-NEXT: jne .LBB0_2
				; CHECK-NEXT: # %bb.1: # %bb_free
				; CHECK-NEXT: callq free_object@PLT
				; CHECK-NEXT: .LBB0_4: # %end
				; CHECK-NEXT: popq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				; CHECK-NEXT: .LBB0_2: # %bb2
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: jle .LBB0_4
				; CHECK-NEXT: # %bb.3: # %bb_dec
				; CHECK-NEXT: decl %eax
				; CHECK-NEXT: movl %eax, (%rdi)
				; CHECK-NEXT: popq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				%count = load i32, i32* %p, align 4
				%cmp0 = icmp eq i32 %count, 1
				br i1 %cmp0, label %bb_free, label %bb2

				bb2:
				%cmp1 = icmp sgt i32 %count, 1
				br i1 %cmp1, label %bb_dec, label %end

				bb_dec:
				%dec = add nsw i32 %count, -1
				store i32 %dec, i32* %p, align 4
				br label %end

				bb_free:
				call void @free_object()
				br label %end

				end:
				ret void
				}

				declare void @free_object()
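
To see why a single `cmpl $1, %eax` can serve both branches in the expected assembly, the following sketch models the flags of a 32-bit compare and evaluates the two conditions (`jne`, then `jle`) on the same flag state. It is an illustration under stated assumptions, not LLVM code: `Flags`, `cmp32`, `condNE`, `condLE`, `Action` and `decrefAction` are hypothetical names, and the flag formulas are the standard x86 definitions for a subtraction.

```cpp
#include <cstdint>

// Subset of EFLAGS produced by a compare (hypothetical model).
struct Flags {
  bool zf, sf, of, cf;
};

// Models `cmp b, a` in AT&T order, i.e. computes the flags of a - b.
Flags cmp32(int32_t a, int32_t b) {
  const uint32_t ua = static_cast<uint32_t>(a);
  const uint32_t ub = static_cast<uint32_t>(b);
  const uint32_t r = ua - ub;
  Flags f;
  f.zf = r == 0;
  f.sf = (r >> 31) != 0;
  f.cf = ua < ub;                               // unsigned borrow
  f.of = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;   // signed overflow
  return f;
}

bool condNE(const Flags &f) { return !f.zf; }                  // jne
bool condLE(const Flags &f) { return f.zf || f.sf != f.of; }   // jle

enum class Action { Free, Decrement, Immortal };

// Mirrors the control flow of the expected assembly for @decref: one compare
// against 1, then two conditional branches on the same flags.
Action decrefAction(int32_t count) {
  const Flags f = cmp32(count, 1); // cmpl $1, %eax
  if (!condNE(f))
    return Action::Free;           // fallthrough into %bb_free
  if (condLE(f))
    return Action::Immortal;       // jle .LBB0_4 (count <= 1)
  return Action::Decrement;        // %bb_dec (count > 1)
}
```

Because `count > 1` is the negation of `count <= 1`, the second branch needs no flags beyond those of the first compare.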

llvm/test/CodeGen/X86/use-cr-result-of-dom-icmp-st.ll

	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movq %rsi, %rax			; CHECK-NEXT: movq %rsi, %rax
	; CHECK-NEXT: movq %rdi, %rdx			; CHECK-NEXT: movq %rdi, %rdx
	; CHECK-NEXT: movl %eax, %ecx			; CHECK-NEXT: movl %eax, %ecx
	; CHECK-NEXT: shlq %cl, %rdx			; CHECK-NEXT: shlq %cl, %rdx
	; CHECK-NEXT: cmpq $1, %rdx			; CHECK-NEXT: cmpq $1, %rdx
	; CHECK-NEXT: jg .LBB3_2			; CHECK-NEXT: jg .LBB3_2
	; CHECK-NEXT: # %bb.1: # %if.end			; CHECK-NEXT: # %bb.1: # %if.end
	; CHECK-NEXT: testq %rdx, %rdx
	; CHECK-NEXT: movl $1, %ecx			; CHECK-NEXT: movl $1, %ecx
	; CHECK-NEXT: cmovleq %rcx, %rax			; CHECK-NEXT: cmovlq %rcx, %rax
	; CHECK-NEXT: imulq %rdi, %rax			; CHECK-NEXT: imulq %rdi, %rax
	; CHECK-NEXT: .LBB3_2: # %return			; CHECK-NEXT: .LBB3_2: # %return
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	entry:			entry:
	%shl = shl i64 %a, %b			%shl = shl i64 %a, %b
	%cmp = icmp sgt i64 %shl, 1			%cmp = icmp sgt i64 %shl, 1
	br i1 %cmp, label %return, label %if.end			br i1 %cmp, label %return, label %if.end


	define i64 @ll_a_1(i64 %a, i64 %b) {			define i64 @ll_a_1(i64 %a, i64 %b) {
	; CHECK-LABEL: ll_a_1:			; CHECK-LABEL: ll_a_1:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movq %rsi, %rax			; CHECK-NEXT: movq %rsi, %rax
	; CHECK-NEXT: cmpq $1, %rdi			; CHECK-NEXT: cmpq $1, %rdi
	; CHECK-NEXT: jg .LBB8_2			; CHECK-NEXT: jg .LBB8_2
	; CHECK-NEXT: # %bb.1: # %if.end			; CHECK-NEXT: # %bb.1: # %if.end
	; CHECK-NEXT: testq %rdi, %rdi
	; CHECK-NEXT: movl $1, %ecx			; CHECK-NEXT: movl $1, %ecx
	; CHECK-NEXT: cmovleq %rcx, %rax			; CHECK-NEXT: cmovlq %rcx, %rax
	; CHECK-NEXT: imulq %rdi, %rax			; CHECK-NEXT: imulq %rdi, %rax
	; CHECK-NEXT: .LBB8_2: # %return			; CHECK-NEXT: .LBB8_2: # %return
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	entry:			entry:
	%cmp = icmp sgt i64 %a, 1			%cmp = icmp sgt i64 %a, 1
	br i1 %cmp, label %return, label %if.end			br i1 %cmp, label %return, label %if.end

	if.end: ; preds = %entry			if.end: ; preds = %entry
	; CHECK-LABEL: i_a_op_b_1:			; CHECK-LABEL: i_a_op_b_1:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movl %esi, %ecx			; CHECK-NEXT: movl %esi, %ecx
	; CHECK-NEXT: movl %edi, %eax			; CHECK-NEXT: movl %edi, %eax
	; CHECK-NEXT: shll %cl, %eax			; CHECK-NEXT: shll %cl, %eax
	; CHECK-NEXT: cmpl $1, %eax			; CHECK-NEXT: cmpl $1, %eax
	; CHECK-NEXT: jg .LBB13_2			; CHECK-NEXT: jg .LBB13_2
	; CHECK-NEXT: # %bb.1: # %if.end			; CHECK-NEXT: # %bb.1: # %if.end
	; CHECK-NEXT: testl %eax, %eax
	; CHECK-NEXT: movl $1, %eax			; CHECK-NEXT: movl $1, %eax
	; CHECK-NEXT: cmovlel %eax, %ecx			; CHECK-NEXT: cmovll %eax, %ecx
	; CHECK-NEXT: imull %edi, %ecx			; CHECK-NEXT: imull %edi, %ecx
	; CHECK-NEXT: .LBB13_2: # %return			; CHECK-NEXT: .LBB13_2: # %return
	; CHECK-NEXT: movslq %ecx, %rax			; CHECK-NEXT: movslq %ecx, %rax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	entry:			entry:
	%shl = shl i32 %a, %b			%shl = shl i32 %a, %b
	%cmp = icmp sgt i32 %shl, 1			%cmp = icmp sgt i32 %shl, 1
	br i1 %cmp, label %return, label %if.end			br i1 %cmp, label %return, label %if.end
	}			}

	define i64 @i_a_1(i32 signext %a, i32 signext %b) {			define i64 @i_a_1(i32 signext %a, i32 signext %b) {
	; CHECK-LABEL: i_a_1:			; CHECK-LABEL: i_a_1:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: cmpl $1, %edi			; CHECK-NEXT: cmpl $1, %edi
	; CHECK-NEXT: jg .LBB18_2			; CHECK-NEXT: jg .LBB18_2
	; CHECK-NEXT: # %bb.1: # %if.end			; CHECK-NEXT: # %bb.1: # %if.end
	; CHECK-NEXT: testl %edi, %edi
	; CHECK-NEXT: movl $1, %eax			; CHECK-NEXT: movl $1, %eax
	; CHECK-NEXT: cmovlel %eax, %esi			; CHECK-NEXT: cmovll %eax, %esi
	; CHECK-NEXT: imull %edi, %esi			; CHECK-NEXT: imull %edi, %esi
	; CHECK-NEXT: .LBB18_2: # %return			; CHECK-NEXT: .LBB18_2: # %return
	; CHECK-NEXT: movslq %esi, %rax			; CHECK-NEXT: movslq %esi, %rax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	entry:			entry:
	%cmp = icmp sgt i32 %a, 1			%cmp = icmp sgt i32 %a, 1
	br i1 %cmp, label %return, label %if.end			br i1 %cmp, label %return, label %if.end
